Activity

From 10/14/2022 to 11/12/2022

11/12/2022

04:18 PM Bug #58013 (New): Osdmap too big leads to OSD crash
My cluster has failed: a large number of OSDs cannot be started,
and troubleshooting found that the size of osd...
伟杰 谭
03:40 PM Bug #58012: OpTracker event duration calculation error
with https://github.com/ceph/ceph/pull/48860 applied:
v2:...
Honggang Yang
03:05 PM Bug #58012 (Duplicate): OpTracker event duration calculation error
h1. ceph version... Honggang Yang

11/11/2022

08:31 PM RADOS Bug #56101: Gibba Cluster: 17.2.0 to 17.2.1 RC upgrade OSD crash in function safe_timer
Also to note: We set `ceph config set mgr mgr_stats_period 1` on the gibba cluster to reproduce this bug. (This occur... Laura Flores
07:00 PM devops Bug #56411 (Closed): Workaround for ceph-mgr breaks Cython builds
Adam Emerson
06:27 PM RADOS Bug #49689: osd/PeeringState.cc: ceph_abort_msg("past_interval start interval mismatch") start
I think https://tracker.ceph.com/issues/49689#note-31 makes sense and the following logs also show what max_oldest_ma... Neha Ojha
05:50 PM Feature #58010 (New): Add the OpenSSF Scorecard Action
Hey, I'm Pedro and I'm working for Google and the "OpenSSF":https://openssf.org/ to improve the supply-chain security... Pedro Nacht
03:09 PM Dashboard Bug #57987 (In Progress): mgr/dashboard: missing data on hosts Grafana dashboard
Tatjana Dehler
02:11 PM CephFS Support #57952: Pacific: the buffer_anon_bytes of ceph-mds is too large
Venky Shankar wrote:
> xianpao chen wrote:
> > Venky Shankar wrote:
> > > Could you share the output of
> > >
>...
xianpao chen
01:02 PM CephFS Support #57952: Pacific: the buffer_anon_bytes of ceph-mds is too large
xianpao chen wrote:
> Venky Shankar wrote:
> > Could you share the output of
> >
> > [...]
> >
> > Also, does...
Venky Shankar
10:52 AM Dashboard Tasks #58009 (Resolved): mgr/dashboard: style cards on the page
h3. Description
Currently on the dashboard revamp we are placing the cards within a Bootstrap grid of two rows w...
Pedro González Gómez
10:08 AM RADOS Backport #58007: pacific: bail from handle_command() if _generate_command_map() fails
please link this Backport tracker issue with GitHub PR https://github.com/ceph/ceph/pull/48846
ceph-backport.sh versi...
nikhil kshirsagar
09:07 AM RADOS Backport #58007 (Resolved): pacific: bail from handle_command() if _generate_command_map() fails
https://github.com/ceph/ceph/pull/48846 Backport Bot
10:03 AM RADOS Backport #58006: quincy: bail from handle_command() if _generate_command_map() fails
please link this Backport tracker issue with GitHub PR https://github.com/ceph/ceph/pull/48845
ceph-backport.sh versi...
nikhil kshirsagar
09:07 AM RADOS Backport #58006 (Resolved): quincy: bail from handle_command() if _generate_command_map() fails
https://github.com/ceph/ceph/pull/48845 Backport Bot
09:14 AM CephFS Bug #58008: mds/PurgeQueue: don't consider filer_max_purge_ops when _calculate_ops
When increasing filer_max_purge_ops on a pacific version mds, pq_executing_ops/pq_executing_ops_high_water of purge_q... yixing hao
09:13 AM CephFS Bug #58008 (Resolved): mds/PurgeQueue: don't consider filer_max_purge_ops when _calculate_ops
_calculate_ops relying on a config which can be modified on the fly will cause a bug. e.g.
# A file has 20 objects...
yixing hao
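To illustrate the hazard described above, here is a minimal C++ sketch (invented names, not the MDS code): recomputing an ops budget from a live config gives different answers at submit time and at completion time, so the in-flight accounting drifts.
<pre><code class="cpp">
#include <algorithm>
#include <atomic>
#include <cstdint>
#include <iostream>

std::atomic<uint32_t> filer_max_purge_ops{10};  // tunable at runtime

uint32_t calculate_ops(uint32_t num_objects) {
  // Reads the live config; two calls may observe different values.
  return std::min(num_objects, filer_max_purge_ops.load());
}

int main() {
  int64_t ops_in_flight = 0;
  ops_in_flight += calculate_ops(20);  // submit: config is 10 -> +10
  filer_max_purge_ops = 40;            // admin raises the option mid-flight
  ops_in_flight -= calculate_ops(20);  // complete: config is 40 -> -20
  std::cout << "ops_in_flight = " << ops_in_flight << "\n";  // -10, not 0
}
</code></pre>
One way to avoid this is to record the value computed at submit time and reuse it at completion.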
09:01 AM RADOS Bug #57859 (Pending Backport): bail from handle_command() if _generate_command_map() fails
PR https://github.com/ceph/ceph/pull/48044 has been merged in main. Ponnuvel P
06:46 AM Bug #57973: rook:rook module failed to connect k8s api server because of self-signed cert with se...
It seems due to bad k8s cert trust chain. ceph is fine. This bug could be closed. Ben Gao
01:01 AM rgw Bug #57562: multisite replication issue on Quincy
It should, thank you. I don't think it's the underlying cause, but it's a good catch. Adam Emerson

11/10/2022

11:57 PM crimson Bug #58005 (Resolved): release-built osd failed to mkfs
It seems that when "seastar::need_preempt" is true, "crimson::do_for_each" will turn into a long recursive function. ... Xuehan Xu
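As a toy C++ model of that failure mode (not the Seastar/Crimson code): an asynchronous for-each that chains a continuation whenever the body cannot complete inline behaves like a recursive call per element, so the stack depth grows linearly with the collection size.
<pre><code class="cpp">
#include <functional>
#include <iostream>
#include <iterator>
#include <vector>

void do_for_each(std::vector<int>::iterator it,
                 std::vector<int>::iterator end,
                 const std::function<void(int)>& body) {
  if (it == end)
    return;
  body(*it);
  // A future-based loop that is asked to yield chains a continuation here
  // instead of iterating, which behaves like this recursive call:
  // one stack frame per element, until the stack runs out.
  do_for_each(std::next(it), end, body);
}

int main() {
  std::vector<int> v(10000, 1);  // large enough N would overflow the stack
  long sum = 0;
  do_for_each(v.begin(), v.end(), [&](int x) { sum += x; });
  std::cout << sum << "\n";
}
</code></pre>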
11:37 PM RADOS Bug #56101 (Fix Under Review): Gibba Cluster: 17.2.0 to 17.2.1 RC upgrade OSD crash in function s...
Laura Flores
11:21 PM RADOS Bug #56101 (In Progress): Gibba Cluster: 17.2.0 to 17.2.1 RC upgrade OSD crash in function safe_t...
Laura Flores
04:52 AM RADOS Bug #56101: Gibba Cluster: 17.2.0 to 17.2.1 RC upgrade OSD crash in function safe_timer
Thanks for your work in capturing the core Laura.
I had a look at the coredump and it shows exactly what we had sp...
Brad Hubbard
09:06 PM rgw Bug #57562: multisite replication issue on Quincy
A potential bug?
https://github.com/ceph/ceph/blob/main/src/cls/fifo/cls_fifo_types.h#L66
Should it be the follow...
Jane Zhu
07:14 PM RADOS Bug #52657: MOSDPGLog::encode_payload(uint64_t): Assertion `HAVE_FEATURE(features, SERVER_NAUTILUS)'
/a/yuriw-2022-10-17_17:31:25-rados-wip-yuri7-testing-2022-10-17-0814-distro-default-smithi/7071031 Laura Flores
07:09 PM Orchestrator Bug #57311: rook: ensure CRDs are installed first
/a/yuriw-2022-10-17_17:31:25-rados-wip-yuri7-testing-2022-10-17-0814-distro-default-smithi/7070926 Laura Flores
05:11 PM Orchestrator Backport #58004 (In Progress): quincy: rook/k8s: nfs cluster creation ends up with no daemons dep...
Juan Miguel Olmo Martínez
05:00 PM Orchestrator Backport #58004 (Resolved): quincy: rook/k8s: nfs cluster creation ends up with no daemons deploy...
https://github.com/ceph/ceph/pull/48830 Backport Bot
04:54 PM Orchestrator Bug #57954 (Pending Backport): rook/k8s: nfs cluster creation ends up with no daemons deployment
Juan Miguel Olmo Martínez
03:44 PM Fix #58003 (Pending Backport): mon: add exception handling to ceph health mute
Running ceph health mute with an invalid TTL causes the mon to crash, because the exception thrown by parse_timespan(... Daniel R
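A minimal sketch of the fix pattern, assuming the crash comes from an uncaught parsing exception as described; the parser below is a mock stand-in, not Ceph's actual parse_timespan:
<pre><code class="cpp">
#include <chrono>
#include <iostream>
#include <stdexcept>
#include <string>

// Mock TTL parser that throws on malformed input, like the real one does.
std::chrono::seconds parse_ttl(const std::string& s) {
  size_t pos = 0;
  long v = std::stol(s, &pos);  // throws std::invalid_argument on garbage
  if (pos != s.size() || v < 0)
    throw std::invalid_argument("invalid timespan: " + s);
  return std::chrono::seconds(v);
}

// The fix pattern: convert the exception into an error reply instead of
// letting it unwind through the monitor's command handler.
int handle_health_mute(const std::string& ttl) {
  try {
    auto d = parse_ttl(ttl);
    std::cout << "muting health alerts for " << d.count() << "s\n";
    return 0;
  } catch (const std::exception& e) {
    std::cerr << "error: " << e.what() << "\n";
    return -22;  // -EINVAL back to the client; the mon keeps running
  }
}

int main() {
  handle_health_mute("600");    // ok
  handle_health_mute("sixty");  // rejected gracefully, no crash
}
</code></pre>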
03:25 PM rgw Bug #57706 (Need More Info): When creating a new user, if the 'uid' is not provided, error report...
Hi Kevin Wang,
Could you tell me what version of Ceph this issue occurred on? The issue does seem to be resolved in the ...
Ali Maredia
03:07 PM rgw Bug #57724 (Fix Under Review): Keys returned by Admin API during user creation on secondary zone ...
Casey Bodley
01:47 PM Orchestrator Bug #58001 (Fix Under Review): haproxy targets are not updated correctly in prometheus.yaml file
Redouane Kachach Elhichou
09:57 AM Orchestrator Bug #58001 (Resolved): haproxy targets are not updated correctly in prometheus.yaml file
steps to reproduce the issue:
1) Bootstrap a new cluster (with monitoring enabled)
2) Wait until Prometheus is up...
Redouane Kachach Elhichou
01:14 PM Bug #58002 (New): mon_max_pg_per_osd is not checked per OSD
The warning for exceeding mon_max_pg_per_osd seems to be triggered only when the average PG count over all OSDs excee... Frank Schilder
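A toy numeric illustration of the reported behavior (all numbers invented): with a limit of 250 PGs per OSD, an average-based check stays silent even when a single OSD carries 400 PGs.
<pre><code class="cpp">
#include <algorithm>
#include <iostream>
#include <numeric>
#include <vector>

int main() {
  const int mon_max_pg_per_osd = 250;
  std::vector<int> pgs_per_osd = {100, 100, 400};  // one OSD far over limit

  double avg = std::accumulate(pgs_per_osd.begin(), pgs_per_osd.end(), 0.0) /
               pgs_per_osd.size();                 // 200 -> below the limit
  int worst = *std::max_element(pgs_per_osd.begin(), pgs_per_osd.end());

  std::cout << std::boolalpha;
  std::cout << "average check warns: " << (avg > mon_max_pg_per_osd) << "\n";
  std::cout << "per-OSD check warns: " << (worst > mon_max_pg_per_osd) << "\n";
}
</code></pre>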
11:50 AM RADOS Bug #57989: test-erasure-eio.sh fails since pg is not in unfound
For some reason, the pool already exist... Nitzan Mordechai
08:44 AM RADOS Bug #57757 (In Progress): ECUtil: terminate called after throwing an instance of 'ceph::buffer::v...
Nitzan Mordechai
08:42 AM RADOS Bug #57618 (Fix Under Review): rados/test.sh hang and pkilled (LibRadosWatchNotifyEC.WatchNotify)
Nitzan Mordechai
08:34 AM RADOS Bug #57618: rados/test.sh hang and pkilled (LibRadosWatchNotifyEC.WatchNotify)
Some of the OSDs stopped due to valgrind errors. This is a duplicate of another bug. Nitzan Mordechai
08:39 AM RADOS Bug #57751 (Fix Under Review): LibRadosAio.SimpleWritePP hang and pkill
Nitzan Mordechai
08:18 AM CephFS Support #57952: Pacific: the buffer_anon_bytes of ceph-mds is too large
Venky Shankar wrote:
> BTW, are you *not* seeing any "oversized cache" warning for the MDS?
there is no "oversize...
xianpao chen
04:06 AM CephFS Support #57952: Pacific: the buffer_anon_bytes of ceph-mds is too large
BTW, are you *not* seeing any "oversized cache" warning for the MDS? Venky Shankar
02:42 AM CephFS Support #57952: Pacific: the buffer_anon_bytes of ceph-mds is too large
Do you have lots of small files and frequently scan them? Venky Shankar
01:12 AM CephFS Support #57952: Pacific: the buffer_anon_bytes of ceph-mds is too large
Venky Shankar wrote:
> Have you tried running `heap release`?
Yes, but it didn't seem to work.
xianpao chen
07:38 AM RADOS Bug #57546: rados/thrash-erasure-code: wait_for_recovery timeout due to "active+clean+remapped+la...
Thanks for taking a look Radek! That's a good point since we are seeing this issue with rados/thrash-erasure-code tes... Aishwarya Mathuria
01:45 AM CephFS Bug #58000: mds: switch submit_mutex to fair mutex for MDLog
From Patrick's comment in https://github.com/ceph/ceph/pull/44180#pullrequestreview-1174516711. Xiubo Li
01:44 AM CephFS Bug #58000 (Resolved): mds: switch submit_mutex to fair mutex for MDLog
The implementations of the Mutex (e.g. std::mutex in C++) do not
guarantee fairness, they do not guarantee that the ...
Xiubo Li
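To make the fairness point concrete, here is a minimal FIFO "ticket lock" in C++; it is purely illustrative and not the fair mutex used for MDLog.
<pre><code class="cpp">
#include <atomic>
#include <iostream>
#include <thread>

class TicketMutex {
  std::atomic<unsigned> next_ticket{0};  // dispenser
  std::atomic<unsigned> now_serving{0};  // ticket allowed to hold the lock
 public:
  void lock() {
    unsigned my = next_ticket.fetch_add(1, std::memory_order_relaxed);
    while (now_serving.load(std::memory_order_acquire) != my)
      std::this_thread::yield();         // waiters proceed in arrival order
  }
  void unlock() {
    now_serving.fetch_add(1, std::memory_order_release);
  }
};

int main() {
  TicketMutex m;
  int counter = 0;
  auto work = [&] {
    for (int i = 0; i < 1000; ++i) { m.lock(); ++counter; m.unlock(); }
  };
  std::thread a(work), b(work);
  a.join();
  b.join();
  std::cout << counter << "\n";  // 2000
}
</code></pre>
Because each waiter spins on its own ticket, lock acquisition follows arrival order, which is exactly the property std::mutex leaves unspecified.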

11/09/2022

10:56 PM RADOS Bug #56101: Gibba Cluster: 17.2.0 to 17.2.1 RC upgrade OSD crash in function safe_timer
Managed to reproduce this on the Gibba cluster and produce a coredump!
The core file is located on gibba001 under ...
Laura Flores
10:11 PM rgw Bug #57706: When creating a new user, if the 'uid' is not provided, error reported as 'Permission...
On a branch close to the master branch from a vstart cluster when I try this same scenario I see:
[ali@acadia buil...
Ali Maredia
09:51 PM rgw Bug #57562: multisite replication issue on Quincy
We also found a place that might potentially cause issues.
RGW locks the mutex and gets some data from "info" befo...
Jane Zhu
09:22 PM rgw Bug #57562: multisite replication issue on Quincy
Here is some more detailed explanation on how the -EINVAL(-22) error (hence datalog writing failure) happens based on... Jane Zhu
08:18 PM RADOS Backport #57704 (Resolved): quincy: mon/MonMap.h: FAILED ceph_assert(m < ranks.size()) when reduc...
https://github.com/ceph/ceph/pull/48321 Kamoltat (Junior) Sirivadhna
08:17 PM RADOS Backport #57705 (Resolved): pacific: mon/MonMap.h: FAILED ceph_assert(m < ranks.size()) when redu...
https://github.com/ceph/ceph/pull/48320 Kamoltat (Junior) Sirivadhna
08:17 PM RADOS Bug #50089 (Resolved): mon/MonMap.h: FAILED ceph_assert(m < ranks.size()) when reducing number of...
Kamoltat (Junior) Sirivadhna
07:08 PM CephFS Feature #57090 (Fix Under Review): MDSMonitor,mds: add MDSMap flag to prevent clients from connec...
Dhairya Parmar
06:23 PM Orchestrator Bug #57999 (Resolved): cephadm: cephadm always reports new or changed devices even if devices are...
This appears to be an issue with the "created" field changing, which should not affect equality in this case... Adam King
06:19 PM Orchestrator Bug #57998: cephadm stuck trying to download "mon"
hmm, can I see what "ceph config dump" spits out (feel free to remove anything sensitive if necessary)? All the im... Adam King
10:51 AM Orchestrator Bug #57998 (Resolved): cephadm stuck trying to download "mon"
Entire cluster cephadm management is stuck and repeatedly tries to download an unqualified "mon" instead of the ceph ... Shawn Iverson
04:34 PM RADOS Bug #51729: Upmap verification fails for multi-level crush rule
Thanks again for looking at this.
I haven't looked further, but I suspect the issue will come down to the variable...
Chris Durham
01:22 PM CephFS Support #57952: Pacific: the buffer_anon_bytes of ceph-mds is too large
Have you tried running `heap release`? Venky Shankar
09:35 AM CephFS Support #57952: Pacific: the buffer_anon_bytes of ceph-mds is too large
Venky Shankar wrote:
> Could you share the output of
>
> [...]
>
> Also, does running
>
> [...]
>
> redu...
xianpao chen
09:23 AM CephFS Support #57952: Pacific: the buffer_anon_bytes of ceph-mds is too large
Venky Shankar wrote:
> Could you share the output of
>
> [...]
>
> Also, does running
>
> [...]
>
> redu...
xianpao chen
08:56 AM CephFS Support #57952: Pacific: the buffer_anon_bytes of ceph-mds is too large
Could you share the output of... Venky Shankar
08:32 AM crimson Bug #57990: Crimson OSD crashes when trying to bring it up
Crimson is not production ready yet, and there will be no backport to Quincy.
It is expected that there were bugs ...
Yingxin Cheng
07:46 AM Dashboard Backport #57983 (In Progress): quincy: mgr/dashboard: error message displaying when editing journ...
Pedro González Gómez
07:43 AM Dashboard Backport #57982 (In Progress): pacific: mgr/dashboard: error message displaying when editing jour...
Pedro González Gómez
03:01 AM Backport #57997 (In Progress): quincy: ceph-crash service should run as unprivileged user, not ro...
Tim Serong
02:18 AM Backport #57997 (Resolved): quincy: ceph-crash service should run as unprivileged user, not root ...
https://github.com/ceph/ceph/pull/48805 Backport Bot
02:58 AM Backport #57996 (In Progress): pacific: ceph-crash service should run as unprivileged user, not r...
Tim Serong
02:18 AM Backport #57996 (Resolved): pacific: ceph-crash service should run as unprivileged user, not root...
https://github.com/ceph/ceph/pull/48804 Backport Bot
01:54 AM Bug #57967 (Pending Backport): ceph-crash service should run as unprivileged user, not root (CVE-...
Tim Serong

11/08/2022

09:23 PM RADOS Bug #57017: mon-stretched_cluster: degraded stretched mode lead to Monitor crash
pacific backport: https://github.com/ceph/ceph/pull/48803 Kamoltat (Junior) Sirivadhna
08:59 PM RADOS Bug #57017: mon-stretched_cluster: degraded stretched mode lead to Monitor crash
quincy backport: https://github.com/ceph/ceph/pull/48802 Kamoltat (Junior) Sirivadhna
09:20 PM bluestore Feature #57785: fragmentation score in metrics
@Vikhyat, no worries. Based on Kevin's comment, I think this metric might be better suited for Prometheus than Teleme... Laura Flores
06:37 PM bluestore Feature #57785: fragmentation score in metrics
Laura - sorry I missed the update. Can you please ping Adam and Igor? Vikhyat Umrao
08:59 PM rbd Bug #57941: Severe performance drop after writing 100 GB of data to RBD volume, dependent on RAM ...
Thanks for looking into this Christopher. You are right, this is a 100% sequential workload, just filling a volume wi... Guillaume Pothier
08:23 PM rbd Bug #57941 (In Progress): Severe performance drop after writing 100 GB of data to RBD volume, dep...
Christopher Hoffman
08:23 PM rbd Bug #57941: Severe performance drop after writing 100 GB of data to RBD volume, dependent on RAM ...
I'm not familiar with PVE and how it sets up Ceph. I took a look at your testcase and it appears to be a sequential w... Christopher Hoffman
07:37 PM bluestore Fix #54299 (Need More Info): osd error restart
Igor Fedotov
07:34 PM bluestore Bug #57672 (Duplicate): SSD OSD won't start after high fragmentation score!
Igor Fedotov
07:27 PM bluestore Bug #53466 (In Progress): OSD is unable to allocate free space for BlueFS
Igor Fedotov
07:23 PM RADOS Bug #51729: Upmap verification fails for multi-level crush rule
I believe I've reproduced the issue using the osdmaps that Chris provided.
First, I used the osdmaptool to run the...
Laura Flores
02:49 PM rgw Bug #57911 (Fix Under Review): Segmentation fault when uploading file with bucket policy on Quincy
Daniel Gryniewicz
02:08 PM RADOS Bug #57757: ECUtil: terminate called after throwing an instance of 'ceph::buffer::v15_2_0::end_of...
After rechecking the logs, it looks like we are taking 2 different versions of smithi01231941-9:head
All chunks with ...
Nitzan Mordechai
05:44 AM RADOS Bug #57757: ECUtil: terminate called after throwing an instance of 'ceph::buffer::v15_2_0::end_of...
@Laura, thanks for confirming that in the coredump. Yes, shard0 is also showing that when it gets the chunk from bluestore:
...
Nitzan Mordechai
12:07 AM RADOS Bug #57757: ECUtil: terminate called after throwing an instance of 'ceph::buffer::v15_2_0::end_of...
Brad and I did some more debugging today.
Here is the end of the log associated with the coredump:...
Laura Flores
12:03 PM Orchestrator Bug #57897: ceph mgr restart causes restart of all iscsi daemons in a loop
Adam King wrote:
> this is a painful one. @David at least until we have a fix for this, I will mention that setting ...
David Heap
09:42 AM Bug #57956: Ceph monitors in crash loop
liu jun wrote:
> Creating a pool causes mon to restart
>
> This is the detailed question: https://github.com/rook/...
liu jun
09:32 AM Dashboard Backport #57995 (New): quincy: mgr/dashboard: paginate services
Backport Bot
09:31 AM Dashboard Backport #57994 (Rejected): pacific: mgr/dashboard: paginate services
Backport Bot
09:21 AM Dashboard Feature #56512 (Pending Backport): mgr/dashboard: paginate services
Pere Díaz Bou
08:53 AM Support #57992: Stuck in linking when I compile Ceph
And when I ctrl+c to stop it and restart it with ninja, it gets stuck in "dashboard nodeenv is being installed". Wenyu Huang
02:14 AM Support #57992 (New): Stuck in linking when I compile Ceph
I followed the README to compile Ceph on GitHub (https://github.com/ceph/ceph/tree/v17.2.5#readme). When I ninja the... Wenyu Huang
08:39 AM Orchestrator Feature #51971 (New): cephadm/ingress: update keepalived container image
Reopening to check/address some of the concerns about the current keepalived image Redouane Kachach Elhichou
05:00 AM Dashboard Backport #57993 (New): quincy: mgr/dashboard: Improve level AA color contrast accessibility for d...
Backport Bot
04:51 AM Dashboard Bug #56023 (Pending Backport): mgr/dashboard: Improve level AA color contrast accessibility for d...
Nizamudeen A
01:20 AM Orchestrator Documentation #57991 (New): Migration documentation about osd service
The documentation doesn't mention how to make the osds in the cluster managed, nor how to add new osds. A new cephadm... Kevin Fox
12:24 AM crimson Bug #57990 (New): Crimson OSD crashes when trying to bring it up
Hello,
Using the `crimson-osd` Ubuntu package for Quincy, we're seeing somewhat recurrent crashes when trying to b...
Luciano Lo Giudice

11/07/2022

09:45 PM rgw Bug #57562: multisite replication issue on Quincy
> I think if the create_part is made exclusive, one of them would fail at part creation and let the other complete pa... Oguzhan Ozmen
09:27 PM RADOS Bug #57977: osd:tick checking mon for new map
Radoslaw Zarzynski wrote:
> Octopus is EOL. Does it happen on a supported release?
>
> Regardless of that, could ...
yite gu
06:13 PM RADOS Bug #57977 (Need More Info): osd:tick checking mon for new map
Octopus is EOL. Does it happen on a supported release?
Regardless of that, could you please provide logs from this...
Radoslaw Zarzynski
07:30 PM RADOS Bug #57757: ECUtil: terminate called after throwing an instance of 'ceph::buffer::v15_2_0::end_of...
Also to note, we can see information about argument `to_read` here:... Laura Flores
07:27 PM RADOS Bug #57757: ECUtil: terminate called after throwing an instance of 'ceph::buffer::v15_2_0::end_of...
@Nitzan, what do you think about this analysis? Or are there any other frames/locals you'd like me to check? Laura Flores
07:12 PM RADOS Bug #57757: ECUtil: terminate called after throwing an instance of 'ceph::buffer::v15_2_0::end_of...
Looking at frame 12, I can see that the incorrect length (262144) for shard 0 is evident in the local variable "from"... Laura Flores
06:02 PM RADOS Bug #57757: ECUtil: terminate called after throwing an instance of 'ceph::buffer::v15_2_0::end_of...
Got it to detect the right symbols with the new build!
I will attempt to analyze this coredump at a deeper level, ...
Laura Flores
03:16 PM RADOS Bug #57757: ECUtil: terminate called after throwing an instance of 'ceph::buffer::v15_2_0::end_of...
According to Brad, the build needs to be as close to the test branch that originally experienced the crash as possibl... Laura Flores
07:18 PM RADOS Bug #51729: Upmap verification fails for multi-level crush rule
Thanks Chris! @Radek I have been taking some time to analyze this scenario, and will post updates soon. Laura Flores
06:36 PM RADOS Bug #51729: Upmap verification fails for multi-level crush rule
Thanks for the info! Laura, would you mind retaking a look? Radoslaw Zarzynski
06:36 PM RADOS Bug #51729 (New): Upmap verification fails for multi-level crush rule
Radoslaw Zarzynski
06:43 PM RADOS Bug #50219 (Closed): qa/standalone/erasure-code/test-erasure-eio.sh fails since pg is not in reco...
The original issue was caused by a commit in a wip branch being tested, so it's highly improbable it's a recurrence.... Radoslaw Zarzynski
06:42 PM RADOS Bug #57989 (New): test-erasure-eio.sh fails since pg is not in unfound
/a/lflores-2022-10-17_18:19:55-rados:standalone-main-distro-default-smithi/7071287... Radoslaw Zarzynski
06:35 PM RADOS Bug #57845: MOSDRepOp::encode_payload(uint64_t): Assertion `HAVE_FEATURE(features, SERVER_OCTOPUS...
Likely it's even a duplicate of https://tracker.ceph.com/issues/52657. Radoslaw Zarzynski
06:28 PM RADOS Bug #52136 (Fix Under Review): Valgrind reports memory "Leak_DefinitelyLost" errors.
Neha Ojha
06:26 PM RADOS Bug #57940 (Duplicate): ceph osd crashes with FAILED ceph_assert(clone_overlap.count(clone)) when...
Looks like a duplicate of 56772. Radoslaw Zarzynski
06:24 PM RADOS Bug #55141: thrashers/fastread: assertion failure: rollback_info_trimmed_to == head
Nitzan Mordechai wrote:
> Radoslaw Zarzynski wrote:
> > Well, just found a new occurrence.
> Where can i find it?
...
Radoslaw Zarzynski
06:12 PM RADOS Bug #56101: Gibba Cluster: 17.2.0 to 17.2.1 RC upgrade OSD crash in function safe_timer
Brad and I ran a reproducer on the gibba cluster (restarting OSDs with `for osd in $(systemctl -l |grep osd|gawk '{pr... Laura Flores
06:01 PM RADOS Bug #56101: Gibba Cluster: 17.2.0 to 17.2.1 RC upgrade OSD crash in function safe_timer
Is there any news on that? Radoslaw Zarzynski
05:59 PM RADOS Bug #49689: osd/PeeringState.cc: ceph_abort_msg("past_interval start interval mismatch") start
Updated the PR link. Radoslaw Zarzynski
05:37 PM Dashboard Cleanup #57984 (In Progress): mgr/dashboard: Add tooltip
Pedro González Gómez
07:27 AM Dashboard Cleanup #57984 (Resolved): mgr/dashboard: Add tooltip
h3. Description of problem
Add tooltip for '# Local' and '# Remote' columns in rbd mirroring page.
Pedro González Gómez
03:40 PM rbd Feature #57988 (In Progress): [rbd-mirror] checksumming for snapshot-based mirroring
This is similar to an older feature request for journal-based mirroring but for snapshot-based mirroring the hope is ... Ilya Dryomov
01:48 PM CephFS Bug #57985 (Triaged): mds: warning `clients failing to advance oldest client/flush tid` seen with...
Venky Shankar
09:06 AM CephFS Bug #57985 (Pending Backport): mds: warning `clients failing to advance oldest client/flush tid` ...
https://bugzilla.redhat.com/show_bug.cgi?id=2134709
Generally seen when the MDS is heavily loaded with I/Os. Inter...
Venky Shankar
01:16 PM Dashboard Bug #57987 (Resolved): mgr/dashboard: missing data on hosts Grafana dashboard
h3. Description of problem
A lot of data is missing on the hosts Grafana dashboard (host-detail) and an error mess...
Tatjana Dehler
09:51 AM Linux kernel client Bug #57986: ceph: ceph_fl_release_lock cause "unable to handle kernel paging request at fffffffff...
There should be a race in `filp_close()`, for example in a single process a file is opened twice with two different f... Xiubo Li
09:48 AM Linux kernel client Bug #57986 (Resolved): ceph: ceph_fl_release_lock cause "unable to handle kernel paging request a...
... Xiubo Li
09:37 AM Linux kernel client Bug #57686 (Fix Under Review): general protection fault and CephFS kernel client hangs after MDS ...
The patchwork: https://patchwork.kernel.org/project/ceph-devel/patch/20221107071759.32000-1-xiubli@redhat.com/
<pr...
Xiubo Li
06:54 AM Linux kernel client Bug #57686 (In Progress): general protection fault and CephFS kernel client hangs after MDS failover
Xiubo Li
09:23 AM Bug #57976: ceph-volume lvm activate removes /var/lib/ceph/osd/ceph-XXX folder and then chokes on...
Looks like the problem is gone after a full reboot. No idea what was going on, but it was reproducible on all nodes. Janek Bevendorff
07:22 AM Linux kernel client Bug #57898: ceph client extremely slow kernel version between 5.15 and 6.0
Minjong Kim wrote:
> https://gist.github.com/caffeinism/dbfd974374d620911a6c0c3dd1daadfb
>
> I am not good at wri...
Xiubo Li
06:54 AM Linux kernel client Bug #57817 (Duplicate): general protection fault and CephFS kernel client hangs after MDS failover
This is exactly the same issue with tracker#57686. Xiubo Li
06:37 AM Dashboard Backport #57983 (Resolved): quincy: mgr/dashboard: error message displaying when editing journal
https://github.com/ceph/ceph/pull/48807 Backport Bot
06:37 AM Dashboard Backport #57982 (Resolved): pacific: mgr/dashboard: error message displaying when editing journal...
https://github.com/ceph/ceph/pull/48806 Backport Bot
06:25 AM Dashboard Bug #57922 (Pending Backport): mgr/dashboard: error message displaying when editing journal mirro... Nizamudeen A
Nizamudeen A
06:23 AM Backport #57981 (New): quincy: ceph-mixin: Add Prometheus Alert for Degraded Bond
Backport Bot
06:23 AM rgw Bug #57980: rgw/cloud-transition: transition fails when using MCG Azure Namespacestore with a pre...
Few observations:
- 2022-11-03T08:42:29.718+0000 7fa1bf7e6640 0 lifecycle: ERROR: failed to check object on the ...
Soumya Koduri
06:21 AM rgw Bug #57980 (Pending Backport): rgw/cloud-transition: transition fails when using MCG Azure Namesp...
Reported by - dparkes@redhat.com
>>>>
Found Errors during cloud transition when using MCG Azure Namespacestore wit...
Soumya Koduri
06:18 AM Feature #57962 (Pending Backport): ceph-mixin: Add Prometheus Alert for Degraded Bond
Nizamudeen A
06:07 AM rgw Bug #57979 (Pending Backport): rgw/cloud-transition: Issues with MCG cloud endpoint
Below issues were observed while testing cloud-transition feature using MCG (Noobaa) endpoint
1) Creation of targe...
Soumya Koduri
04:35 AM Bug #57966: Ceph cluster osds failed when ms_cluster_type=async+rdma is used
The same problem on ceph 17.2.5:
root@ceph01:~# ceph crash info 2022-11-07T03:29:36.731174Z_bb6f8fea-ea87-4f83-a28a-...
guoguo jie
01:08 AM RADOS Bug #57937: pg autoscaler of rgw pools doesn't work after creating otp pool
Are there any updates? Please let me know if I can do something. Satoru Takeuchi

11/06/2022

02:27 PM Dashboard Feature #57978 (Fix Under Review): mgr/dashboard: allow to get/update RBD image metadata via REST...
h3. Description of problem
Currently we are missing the ability to get/update RBD image metadata via REST API. We c...
Mykola Golub
05:47 AM RADOS Bug #57757: ECUtil: terminate called after throwing an instance of 'ceph::buffer::v15_2_0::end_of...
@brad, maybe it's a good candidate for another blog for upstream core dump analysis that you talked about (ubuntu 20.04) Nitzan Mordechai

11/05/2022

02:21 PM Orchestrator Bug #57954: rook/k8s: nfs cluster creation ends up with no daemons deployment
*PR*: https://github.com/ceph/ceph/pull/48694
It has been approved.
Ben Gao

11/04/2022

08:27 PM Dashboard Feature #43264 (Resolved): mgr/dashboard: audit current WCAG 2.1 support [accessibility - a11y]
I think we can resolve this tracker because of Sedrick's work. If more accessibility improvements need to happen in t... Laura Flores
07:48 PM CephFS Bug #49132: mds crashed "assert_condition": "state == LOCK_XLOCK || state == LOCK_XLOCKDONE",
Alternative fix is available at https://github.com/ceph/ceph/pull/48743 Igor Fedotov
07:21 PM RADOS Bug #57757: ECUtil: terminate called after throwing an instance of 'ceph::buffer::v15_2_0::end_of...
@Brad do you have any tips on how to load the correct debug symbols for the above coredump? After running the `ceph-d... Laura Flores
05:48 PM RADOS Bug #57757: ECUtil: terminate called after throwing an instance of 'ceph::buffer::v15_2_0::end_of...
No luck yet, but I'm trying to set up the right debug environment. So far, gdb is only giving me question marks, but ... Laura Flores
06:10 AM RADOS Bug #57757: ECUtil: terminate called after throwing an instance of 'ceph::buffer::v15_2_0::end_of...
Laura, are you able to use GDB with debuginfo on that coredump file? Nitzan Mordechai
04:18 PM RADOS Bug #57977 (Pending Backport): osd:tick checking mon for new map
ceph version: 15.2.7
my cluster has an OSD down, and it is unable to join the osdmap....
yite gu
03:00 PM rgw Bug #57911 (In Progress): Segmentation fault when uploading file with bucket policy on Quincy
Daniel Gryniewicz
12:45 PM Orchestrator Bug #57897: ceph mgr restart causes restart of all iscsi daemons in a loop
this is a painful one. @David at least until we have a fix for this, I will mention that setting the iscsi spec to un... Adam King
12:29 PM Orchestrator Bug #57897: ceph mgr restart causes restart of all iscsi daemons in a loop
In this case I think it would be helpful to see what's the actual content of the deps. To get this information, from ... Redouane Kachach Elhichou
11:24 AM Dashboard Subtask #39668 (In Progress): mgr/dashboard: REST API: improve query syntax: pagination, filterin...
Ernesto Puerta
11:23 AM mgr Feature #45264 (Duplicate): mgr: create new module for exposing ceph-mgr python API via CLI
Ernesto Puerta
10:50 AM Bug #57976 (New): ceph-volume lvm activate removes /var/lib/ceph/osd/ceph-XXX folder and then cho...
When I create a new OSD and try to activate it, the activation step removes the @/var/lib/ceph/osd/ceph-XXX@ mount fo... Janek Bevendorff
09:17 AM RADOS Feature #48392: ceph ignores --keyring?
This issue is still present in Pacific. Is there any way to work around it except for moving the keys to /etc/ceph?
...
Janek Bevendorff
08:54 AM CephFS Backport #57974 (In Progress): pacific: cephfs-top: make cephfs-top display scrollable like top
Jos Collin
08:46 AM CephFS Backport #57974 (Resolved): pacific: cephfs-top: make cephfs-top display scrollable like top
https://github.com/ceph/ceph/pull/48734 Jos Collin
08:42 AM Bug #57973: rook:rook module failed to connect k8s api server because of self-signed cert with se...
I am working on it. Ben Gao
08:41 AM Bug #57973 (New): rook:rook module failed to connect k8s api server because of self-signed cert w...
steps to reproduce:
1. with rook, deploy ceph on a k8s cluster
2. run the following to enable rook as orchestrator
...
Ben Gao
05:25 AM Dashboard Cleanup #57972 (Resolved): mgr/dashboard: update jest to 28
Nizamudeen A
04:04 AM Dashboard Backport #57469 (In Progress): quincy: mgr/dashboard: Improve accessibility for navigation compon...
Ngwa Sedrick Meh
03:51 AM CephFS Backport #57971 (Resolved): pacific: cephfs-top: new options to limit and order-by
https://github.com/ceph/ceph/pull/49303 Backport Bot
03:50 AM CephFS Backport #57970 (Resolved): quincy: cephfs-top: new options to limit and order-by
https://github.com/ceph/ceph/pull/50151 Backport Bot
03:25 AM CephFS Feature #55121 (Pending Backport): cephfs-top: new options to limit and order-by
Jos Collin

11/03/2022

08:56 PM RADOS Bug #57757: ECUtil: terminate called after throwing an instance of 'ceph::buffer::v15_2_0::end_of...
By the way, I have the coredump saved on the teuthology node under /home/lflores/tracker_57757. Laura Flores
03:31 PM RADOS Bug #57757: ECUtil: terminate called after throwing an instance of 'ceph::buffer::v15_2_0::end_of...
The output Nitzan pasted is from printing ECBackend::read_result_t:
src/osd/ECBackend.cc...
Laura Flores
03:23 PM RADOS Bug #57757: ECUtil: terminate called after throwing an instance of 'ceph::buffer::v15_2_0::end_of...
Perhaps there is somewhere that the length should be getting updated, but it is not? Laura Flores
08:34 PM Bug #57228 (Closed): BoostConfig cmake error affects "make check arm64" main PRs
No longer relevant. Laura Flores
07:11 PM rgw Bug #57562: multisite replication issue on Quincy
We are still testing the latest evidence (HEAD at https://github.com/ceph/ceph/commit/cfc3bde36dbc9c6e0b7182bbb325390... Oguzhan Ozmen
07:02 PM rgw Bug #57936 (Fix Under Review): 'radosgw-admin bucket chown' doesn't set bucket instance owner or ...
Daniel Gryniewicz
02:16 PM rgw Bug #57936 (In Progress): 'radosgw-admin bucket chown' doesn't set bucket instance owner or unlin...
Casey Bodley
02:27 PM RADOS Bug #57969 (New): monitor: ceph -s shows all monitors out of quorum for < 1s
`ceph -s` shows all monitors out of quorum for a very short time, < 1s.
The issue is likely to have no real effect on the ...
Kamoltat (Junior) Sirivadhna
02:13 PM rgw Bug #57968 (New): Partial fix for XML responses returning different order of XML elements
Hi,
This is a follow-up on the original problem reported here
https://tracker.ceph.com/issues/52027
I've added my com...
Daniel Iwan
02:13 PM rgw Bug #57951 (Fix Under Review): rgw: lc: lc for a single large bucket can run too long
Casey Bodley
02:03 PM rgw Bug #57724 (In Progress): Keys returned by Admin API during user creation on secondary zone not v...
Casey Bodley
12:45 PM CephFS Feature #44455 (In Progress): cephfs: add recursive unlink RPC
Patrick Donnelly
11:55 AM Bug #57945: On Rocky OS mgr could not start due to wrong python version
Ilya Dryomov wrote:
> Is pacific affected as well? For now, I have set Backport to just quincy based on the origina...
Shimon Tanny
10:53 AM Bug #57945: On Rocky OS mgr could not start due to wrong python version
Is pacific affected as well? For now, I have set Backport to just quincy based on the original title of the issue. Ilya Dryomov
10:52 AM Bug #57945 (Fix Under Review): On Rocky OS mgr could not start due to wrong python version
Ilya Dryomov
09:30 AM CephFS Feature #57090 (In Progress): MDSMonitor,mds: add MDSMap flag to prevent clients from connecting
Dhairya Parmar
07:34 AM CephFS Feature #57090: MDSMonitor,mds: add MDSMap flag to prevent clients from connecting
Patrick Donnelly wrote:
> Dhairya, status on this?
Hi Patrick, I'm on this completely now. Will try to bring somethi...
Dhairya Parmar
09:20 AM CephFS Bug #56270: crash: File "mgr/snap_schedule/module.py", in __init__: self.client = SnapSchedClient...
If you're running into this bug after upgrading from Pacific to Quincy, you can manually delete the legacy schedule D... Andreas Teuchert
08:49 AM CephFS Bug #56270: crash: File "mgr/snap_schedule/module.py", in __init__: self.client = SnapSchedClient...
{"log":"debug 2022-11-03T08:38:12.502+0000 7f46270f5700 -1 mgr load Failed to construct class in 'snap_schedule'\n","... Alexander Mamonov
08:46 AM CephFS Bug #56270: crash: File "mgr/snap_schedule/module.py", in __init__: self.client = SnapSchedClient...
How to fix it? Alexander Mamonov
08:50 AM rgw Bug #44660: Multipart re-uploads cause orphan data
As it was discussed in [1], there is already a wip PR with a more generic solution [2].
[1] https://github.com/ceph/c...
Mykola Golub
05:24 AM Bug #57967 (Fix Under Review): ceph-crash service should run as unprivileged user, not root (CVE-...
Tim Serong
05:11 AM Bug #57967 (Resolved): ceph-crash service should run as unprivileged user, not root (CVE-2022-3650)
As reported at https://www.openwall.com/lists/oss-security/2022/10/25/1, ceph-crash runs as root, which makes it vuln... Tim Serong
02:46 AM Bug #57966 (New): Ceph cluster osds failed when ms_cluster_type=async+rdma is used
Currently, using IPoIB runs normally.
The steps are as follows:
Check cluster health:
`ceph health detail`
Ceph...
guoguo jie
02:42 AM RADOS Bug #52657: MOSDPGLog::encode_payload(uint64_t): Assertion `HAVE_FEATURE(features, SERVER_NAUTILUS)'
Hi Brad, thanks for all the pointers on the tracker!
I went through the code with Josh and Radek after looking at yo...
Aishwarya Mathuria

11/02/2022

08:27 PM rgw Feature #57965 (Resolved): Add new zone option to control whether an object's first data stripe i...
Delete requests are quite slow on clusters that have a data pool backed by HDDs, especially with an EC pool. For exam... Cory Snyder
06:58 PM rgw Bug #57562: multisite replication issue on Quincy
Yeah, both those commits are gone, make sure you only have the newest one. Adam Emerson
06:33 PM rgw Bug #57562: multisite replication issue on Quincy
Adam Emerson wrote:
> Pushed a new version with what should be a fix for multi-thread and multi-client races.
We ...
Oguzhan Ozmen
07:22 AM rgw Bug #57562: multisite replication issue on Quincy
Pushed a new version with what should be a fix for multi-thread and multi-client races. Adam Emerson
04:09 PM Bug #57964 (New): Cephadm: MONs are not getting back after /var/log filesystem is full
I had at least two examples where Cephadm controlled cluster had /var/log full in CentOS Linux and it obviously cause... Piotr Parczewski
03:53 PM RADOS Fix #57963 (Fix Under Review): osd: Misleading information displayed for the running configuratio...
With the fix, the following is shown for an OSD with ssd as the underlying device type:... Sridhar Seshasayee
03:26 PM RADOS Fix #57963: osd: Misleading information displayed for the running configuration of osd_mclock_max...
See BZ: https://bugzilla.redhat.com/show_bug.cgi?id=2111282 for additional information. Sridhar Seshasayee
03:25 PM RADOS Fix #57963 (Resolved): osd: Misleading information displayed for the running configuration of osd...
For the inactive device type(hdd/ssd) of an OSD, the running configuration option osd_mclock_max_capacity_iops_[hdd|s... Sridhar Seshasayee
03:10 PM ceph-volume Bug #57918 (Fix Under Review): CEPHADM_REFRESH_FAILED: failed to probe daemons or devices
Guillaume Abrioux
01:40 PM ceph-volume Bug #57918 (In Progress): CEPHADM_REFRESH_FAILED: failed to probe daemons or devices
Guillaume Abrioux
01:48 PM Feature #57962 (Pending Backport): ceph-mixin: Add Prometheus Alert for Degraded Bond
Currently there is no alert for a misconfigured or failed network interface card which is part of a network bon... Christian Kugler
12:51 PM Orchestrator Bug #57897: ceph mgr restart causes restart of all iscsi daemons in a loop
Attempts to stop or redeploy the daemon don't work as they seem to invoke a dependency check, which then restarts the... David Heap
10:48 AM Orchestrator Bug #57960: iscsi - rbd-target-api unkillable on container exit, daemon enters error state
The systemd service shows the following:... David Heap
10:20 AM Orchestrator Bug #57960 (New): iscsi - rbd-target-api unkillable on container exit, daemon enters error state
Hi,
We have seen lots of iscsi container restarts due to https://tracker.ceph.com/issues/57897 and during some of ...
David Heap
09:04 AM Dashboard Bug #57959 (New): mgr/dashboard: OSD create form without orchestrator error
h3. Description of problem
If you deploy a ceph-dev cluster on main and you try to click on "create" inside Cluste...
Pere Díaz Bou
08:07 AM rgw Bug #57942: rgw leaks rados objects when a part is submitted multiple times in a multipart upload
FYI, pull request https://github.com/ceph/ceph/pull/48704 Peter Goron
06:40 AM RADOS Bug #57533 (Fix Under Review): Able to modify the mclock reservation, weight and limit parameters...
Sridhar Seshasayee

11/01/2022

08:22 PM Orchestrator Bug #57897: ceph mgr restart causes restart of all iscsi daemons in a loop
We currently have this situation again after an mgr failover for node maintenance/reboots and managed enabled debug l... David Heap
08:12 PM rgw Bug #57942: rgw leaks rados objects when a part is submitted multiple times in a multipart upload
FYI, working on https://github.com/pgoron/ceph/commits/fix_rgw_rados_leaks_57942 to fix both issues (index entry leaks ... Peter Goron
07:25 PM rgw Bug #57562: multisite replication issue on Quincy
Agreed; as you mentioned, the other solution could be that the secondary is not limited to just listening for the orphan part, but co... Krunal Chheda
06:37 PM rgw Bug #57562: multisite replication issue on Quincy
Ah, I see, I need to update the async lister. Adam Emerson
06:36 PM rgw Bug #57562: multisite replication issue on Quincy
That's the point of the commit `rgw/fifo: `part_full` is not a reliable indicator`. There is no 'orphan part' in that... Adam Emerson
05:39 PM rgw Bug #57562: multisite replication issue on Quincy
Hey Adam,
Just a heads-up: we tested with the latest commit and we still see the issue.
The issue is seen when running M...
Krunal Chheda
02:11 PM rgw Bug #57562: multisite replication issue on Quincy
Thank you Adam. We'll test with the latest change. Oguzhan Ozmen
06:04 PM Orchestrator Cleanup #57957: cephadm: rename "extra_container_args"
Additionally, there is the "args" field specific to Custom Containers. I think that "args" is an attractively named option for... John Mulligan
03:32 PM Orchestrator Cleanup #57957 (New): cephadm: rename "extra_container_args"
we ideally want a different name that makes it more clear that these are arguments for the podman/docker run command ... Adam King
04:27 PM rgw Bug #44660 (Fix Under Review): Multipart re-uploads cause orphan data
Actually it looks like there is a simpler solution to this problem, which uses the meta object lock when checking if ... Mykola Golub
02:33 PM Bug #57956 (New): Ceph monitors in crash loop

Creating a pool causes mon to restart.
This is the detailed question: https://github.com/rook/rook/issues/10110
...
liu jun
03:43 AM Orchestrator Bug #57954: rook/k8s: nfs cluster creation ends up with no daemons deployment
I am working on this issue Ben Gao
03:42 AM Orchestrator Bug #57954 (Resolved): rook/k8s: nfs cluster creation ends up with no daemons deployment
steps to reproduce:
1. with rook, deploy Ceph 17.2.5 on k8s
2. run the following to enable rook as orchestrator with...
Ben Gao
03:32 AM Support #57953 (New): Ceph runs in the openEuler system, and the pool cannot be created after ini...
https://github.com/rook/rook/issues/11242
Ceph runs in the openEuler system, and the pool cannot be created after ...
liu jun
02:32 AM CephFS Support #57952 (New): Pacific: the buffer_anon_bytes of ceph-mds is too large
The buffer_anon_bytes will reach 200+ GB, then the machine runs out of memory. It does not seem to be able to effectively fr... xianpao chen

10/31/2022

06:56 PM rgw Bug #57562: multisite replication issue on Quincy
Pushed a new version that should make listing reliably list all the objects. Adam Emerson
06:33 PM rbd Bug #49947 (Resolved): document supported architectures for PWL cache plugin
Ilya Dryomov
05:32 PM Backport #57925: pacific: common: use fmt::print for stderr logging
Conditional on fmt >= 9 ?
https://tracker.ceph.com/issues/57540
chris denice
05:22 PM Orchestrator Feature #57944: add option to allow for setting extra daemon args for containerized services
We've hit this too. Kevin Fox
04:40 PM rgw Bug #57951 (Pending Backport): rgw: lc: lc for a single large bucket can run too long
If this happens, other lc hosts/threads can attempt to process the same bucket, which inflates overhead without any c... Matt Benjamin
04:07 PM Orchestrator Feature #57948 (Fix Under Review): Adding support for a secure ceph monitoring stack
Redouane Kachach Elhichou
10:05 AM Orchestrator Feature #57948 (Resolved): Adding support for a secure ceph monitoring stack
Right now the communication between the monitoring components (prometheus, alertmanager, node-exporter, ..) is using... Redouane Kachach Elhichou
03:11 PM Dashboard Backport #57691 (Resolved): pacific: mgr/dashboard: permission denied when creating a NFS export
Nizamudeen A
01:58 PM RADOS Bug #53729 (Resolved): ceph-osd takes all memory before oom on boot
Konstantin Shalygin
01:58 PM RADOS Backport #55633 (Rejected): octopus: ceph-osd takes all memory before oom on boot
Octopus is EOL Konstantin Shalygin
12:53 PM rbd Bug #57902 (Fix Under Review): [rbd-nbd] add --snap-id option to "rbd device map" to allow mappin...
Ilya Dryomov
12:41 PM RADOS Bug #57757: ECUtil: terminate called after throwing an instance of 'ceph::buffer::v15_2_0::end_of...
... Nitzan Mordechai
09:50 AM CephFS Bug #57920: mds:ESubtreeMap event size is too large
Venky Shankar wrote:
> Hi,
>
> Could the list of PRs that try to address this issue be linked? (so, that we don't...
zhikuo du
09:36 AM CephFS Bug #57920: mds:ESubtreeMap event size is too large
Hi,
Could the list of PRs that try to address this issue be linked? (so that we don't lose track of them).
As...
Venky Shankar
04:50 AM CephFS Bug #57920: mds:ESubtreeMap event size is too large
zhikuo du wrote:
> > I am afraid this won't work. As I remembered from my test before, the size of ESubtreeMap could...
Xiubo Li
02:47 AM CephFS Bug #57920: mds:ESubtreeMap event size is too large
> I am afraid this won't work. As I remembered from my test before, the size of ESubtreeMap could reach up to several... zhikuo du
02:37 AM CephFS Bug #57920: mds:ESubtreeMap event size is too large
zhikuo du wrote:
> > May I ask you a question:
> > What factors decide how many event must have a ESubtreeMap e...
Xiubo Li
09:44 AM rgw Feature #57947 (Pending Backport): Improve performance of multi-object delete by handling individ...
Multi-object deletes are currently quite slow. The handler for this method currently just loops through the list of o... Cory Snyder
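A rough sketch of the general idea (fan the per-object deletes out concurrently instead of looping serially); the names and the std::async mechanism are illustrative, not the actual RGW implementation:
<pre><code class="cpp">
#include <chrono>
#include <future>
#include <iostream>
#include <string>
#include <thread>
#include <vector>

// Stand-in for a per-object delete that does a round trip to the OSDs.
bool delete_object(const std::string& key) {
  std::this_thread::sleep_for(std::chrono::milliseconds(10));
  return true;
}

// Fan the deletes out, then gather results; total latency approaches the
// slowest single delete rather than the sum of all of them.
void multi_delete(const std::vector<std::string>& keys) {
  std::vector<std::future<bool>> pending;
  pending.reserve(keys.size());
  for (const auto& k : keys)
    pending.push_back(std::async(std::launch::async, delete_object, k));
  for (auto& f : pending)
    f.get();
}

int main() {
  multi_delete({"a", "b", "c"});
  std::cout << "done\n";
}
</code></pre>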
08:10 AM rgw Bug #57942: rgw leaks rados objects when a part is submitted multiple times in a multipart upload
After digging more on the issue, I think the root cause is linked to following code:
https://github.com/ceph/ceph/...
Peter Goron
05:04 AM Orchestrator Bug #57800: ceph orch upgrade does not appear to work with FQNDs.
And suddenly the upgrade is happening!!!
Today I rebooted ceph02, a node that only had the MDS, and suddenly thing...
Brian Woods
04:35 AM CephFS Backport #57946 (In Progress): quincy: cephfs-top: make cephfs-top display scrollable like top
Jos Collin
04:26 AM CephFS Backport #57946 (Resolved): quincy: cephfs-top: make cephfs-top display scrollable like top
https://github.com/ceph/ceph/pull/48677 Backport Bot
04:21 AM CephFS Feature #55197 (Pending Backport): cephfs-top: make cephfs-top display scrollable like top
Venky Shankar
03:57 AM RADOS Bug #52657: MOSDPGLog::encode_payload(uint64_t): Assertion `HAVE_FEATURE(features, SERVER_NAUTILUS)'
Still trying to run a test with added debugging due to the ongoing infra issues but I noticed that Coverity CID 15096... Brad Hubbard
01:30 AM Dashboard Bug #53230: ceph API tests failed
maybe this is an environmental issue. can close it zhipeng li

10/30/2022

05:14 PM rbd Bug #49947 (Fix Under Review): document supported architectures for PWL cache plugin
Ilya Dryomov
02:18 PM CephFS Bug #57920: mds:ESubtreeMap event size is too large
> @Xiubo Li @Venky Shankar
>
> I read the code about: how the segment is trimmed and how ESubtreeMap/EImportSt...
zhikuo du
01:10 PM CephFS Bug #57920: mds:ESubtreeMap event size is too large
> May I ask you a question:
> What factors decide how many event must have a ESubtreeMap event? And what is the...
zhikuo du
09:42 AM Bug #57945 (Pending Backport): On Rocky OS mgr could not start due to wrong python version
Reproduction:
Compile and run Ceph on Rocky OS.
Run vstart.
Result:
Error in manager process:
python module was ...
Shimon Tanny

10/29/2022

10:29 PM Documentation #47656 (Closed): Install Guide - Fed 32 installation instructions don't work
Zac Dover
10:06 PM RADOS Documentation #46126: RGW docs lack an explanation of how permissions management works, especiall...
You thought that copying this rude exchange verbatim was essential to motivate improving the docs?
Matt
Zac Dover
09:45 PM Documentation #44342 (Resolved): Create a notification and a link to direct people to a particula...
We have had this versioned-documentation menu in the docs for some time now. I have attached screenshots that confirm... Zac Dover
09:32 PM Bug #47063 (Resolved): The RADOS deployment guide refers to ceph-deploy, which is, as of Octopus,...
Robert Sander removed ceph/deploy.rst in 2020: 1b42759e19b352c22d9e9109ecdf6c3b20feed84
https://github.com/ceph/ceph...
Zac Dover
09:28 PM Bug #47064 (Resolved): rados/deployment is redundant
Robert Sander removed this in 2020: 1b42759e19b352c22d9e9109ecdf6c3b20feed84
https://github.com/ceph/ceph/commit/1b4...
Zac Dover
05:32 PM Orchestrator Bug #57800: ceph orch upgrade does not appear to work with FQNDs.
I am getting ready to add another node to the cluster. Is there anything you can think of that I can check, pre or post? Brian Woods
02:20 AM CephFS Feature #55197 (Resolved): cephfs-top: make cephfs-top display scrollable like top
Jos Collin
01:32 AM rgw Bug #57853: multisite sync process block after long time running
I think something is wrong with rgw-coroutine; please check the above PR zhipeng li
01:30 AM rgw Bug #57853: multisite sync process block after long time running
PR https://github.com/ceph/ceph/pull/48626 zhipeng li

10/28/2022

09:42 PM Bug #57914: centos 8 build failed
Note from Dan:
It's looking more and more like we're just going to have to rebuild these broken VMs from scratch. ...
Laura Flores
09:10 PM rgw Bug #57770: RGW (pacific) misplaces index entries after dynamically resharding bucket
J. Eric Ivancich wrote:
> Nick,
>
> I don't know that I have a cluster at my fingertips that might be necessary t...
Nick Janus
07:23 PM rgw Bug #57770: RGW (pacific) misplaces index entries after dynamically resharding bucket
Nick,
I don't know that I have a cluster at my fingertips that might be necessary to test this potential fix. How ...
J. Eric Ivancich
07:21 PM rgw Bug #57770 (Fix Under Review): RGW (pacific) misplaces index entries after dynamically resharding...
J. Eric Ivancich
07:16 PM Orchestrator Feature #57944 (Resolved): add option to allow for setting extra daemon args for containerized se...
The ceph orchestrator YML specs for service templates has an option for "extra_container_args" which allows the user ... Wyllys Ingersoll
04:25 PM CephFS Bug #53509 (Resolved): quota support for subvolumegroup
Greg Farnum
04:25 PM CephFS Bug #53848 (Resolved): mgr/volumes: Failed to create clones if the source snapshot's quota is exc...
Greg Farnum
02:37 PM ceph-volume Bug #57907: ceph-volume complains about "Insufficient space (<5GB)" on 1.75TB device
I would need it with hotplug enabled.
Anyway, I tried to reproduce...
Guillaume Abrioux
01:08 PM ceph-volume Bug #57907: ceph-volume complains about "Insufficient space (<5GB)" on 1.75TB device
Guillaume Abrioux wrote:
> can you share the output of `ceph-volume inventory --format json` ?
With hotplug disab...
Björn Lässig
12:36 PM ceph-volume Bug #57907: ceph-volume complains about "Insufficient space (<5GB)" on 1.75TB device
can you share the output of `ceph-volume inventory --format json` ? Guillaume Abrioux
01:35 PM Dashboard Bug #57943 (New): doc/radosgw: "waiting on unpkg.com" for upwards of one minute when http://local...
h3. Description
*RADOSGW documentation calls unpkg.com when http://localhost:8080/radosgw/multisite/ is loaded in ...
Zac Dover
12:36 PM rgw Bug #57942 (Duplicate): rgw leaks rados objects when a part is submitted multiple times in a mult...
Hello,
The issue presented below affects all Ceph versions at least since 14.2 (reproducer tested on 14.2, 15.2, 16.2,...
Peter Goron
07:11 AM CephFS Backport #57723: pacific: qa: test_subvolume_snapshot_info_if_orphan_clone fails
Backport of https://github.com/ceph/ceph/pull/48642 is also included with this Kotresh Hiremath Ravishankar
01:39 AM bluestore Feature #57785: fragmentation score in metrics
Ultimately, I'd like it in prometheus, so I can set up alerts if it gets too high. Kevin Fox

10/27/2022

09:21 PM rgw Bug #57562: multisite replication issue on Quincy
Pushed a newer, newer fix that guards all calls to _prepare_new_head behind check/set of preparing. Adam Emerson
04:15 PM rgw Bug #57562: multisite replication issue on Quincy
Pushed a newer fix that does the check in need_new_head() Adam Emerson
02:01 PM rgw Bug #57562: multisite replication issue on Quincy
Hi Adam,
We obtained the extra logging with the fix in place.
I think the contention is not within _prepare_ne...
Oguzhan Ozmen
01:09 AM rgw Bug #57562: multisite replication issue on Quincy
I expect there are multiple problems with sync in Quincy, so I don't expect this to actually make sync work.
But i...
Adam Emerson
12:15 AM rgw Bug #57562: multisite replication issue on Quincy
Pulled the changes in on top of the commit _9056dbcdeaa7f4350b54a69f669982358ec5448e_ (on main branch). Unfortunately... Oguzhan Ozmen
08:29 PM rbd Bug #57941 (Rejected): Severe performance drop after writing 100 GB of data to RBD volume, depend...
Write throughput to a mapped RBD volume drops dramatically after the volume reaches a certain usage size. The amount ... Guillaume Pothier
06:07 PM RADOS Bug #57940 (Duplicate): ceph osd crashes with FAILED ceph_assert(clone_overlap.count(clone)) when...
Hi, I have this current crash:
I've experienced a disk failure in my ceph cluster.
I've replaced the disk, but no...
Thomas Le Gentil
05:28 PM bluestore Feature #57785: fragmentation score in metrics
Kevin Fox wrote:
> Currently the bluestore fragmentation score does not seem to be exported in metrics. Due to the i...
Yaarit Hatuka
04:50 PM RADOS Bug #57757: ECUtil: terminate called after throwing an instance of 'ceph::buffer::v15_2_0::end_of...
@Laura, thanks for that! i'll try first with main as you suggested Nitzan Mordechai
03:32 PM RADOS Bug #57757: ECUtil: terminate called after throwing an instance of 'ceph::buffer::v15_2_0::end_of...
@Nitzan, here is the branch if you'd like to rebuild it on ci: https://github.com/ljflores/ceph/commits/wip-lflores-t... Laura Flores
10:36 AM RADOS Bug #57757: ECUtil: terminate called after throwing an instance of 'ceph::buffer::v15_2_0::end_of...
The coredump from branch wip-lflores-testing, I was not able to create docker image since this branch is no longer av... Nitzan Mordechai
02:31 PM rgw Bug #57928 (Duplicate): Octopus:multisite sync process block after long time running
Casey Bodley
02:31 PM rgw Bug #57927 (Duplicate): pacific:multisite sync process block after long time running
Casey Bodley
01:22 PM CephFS Bug #55804 (Duplicate): qa failure: pjd link tests failed
Venky Shankar
01:21 PM CephFS Bug #55804: qa failure: pjd link tests failed
This issue is probably fixed by PR: https://github.com/ceph/ceph/pull/46331 ("mds: wait unlink to finish to avoid con... Venky Shankar
12:55 PM CephFS Bug #57446: qa: test_subvolume_snapshot_info_if_orphan_clone fails
Fixed another possible failure with this test
https://github.com/ceph/ceph/pull/48642
Kotresh Hiremath Ravishankar
12:27 PM CephFS Bug #51278: mds: "FAILED ceph_assert(!segments.empty())"
Venky Shankar wrote:
> Latest occurrence with similar backtrace - https://pulpito.ceph.com/vshankar-2022-06-03_10:03...
Stephen Cuppett
12:23 PM ceph-volume Bug #57939 (New): Not able to add additional disk sharing common wal/db device
Ceph version: 17.2.3 (dff484dfc9e19a9819f375586300b3b79d80034d)
quincy (stable)
Each of our nodes has 8x 16T rot...
Daniel Olsson
12:17 PM RADOS Bug #55141: thrashers/fastread: assertion failure: rollback_info_trimmed_to == head
Radoslaw Zarzynski wrote:
> Well, just found a new occurrence.
Where can i find it?
Nitzan Mordechai
12:13 PM RADOS Bug #50042 (In Progress): rados/test.sh: api_watch_notify failures
Nitzan Mordechai
12:12 PM RADOS Bug #52136 (In Progress): Valgrind reports memory "Leak_DefinitelyLost" errors.
Nitzan Mordechai
11:47 AM RADOS Bug #57751 (In Progress): LibRadosAio.SimpleWritePP hang and pkill
Nitzan Mordechai
10:55 AM RADOS Bug #57751: LibRadosAio.SimpleWritePP hang and pkill
This is not an issue with the test, not all the osd are up, and we are waiting (valgrind report memory leak from rock... Nitzan Mordechai
08:24 AM ceph-volume Bug #57907: ceph-volume complains about "Insufficient space (<5GB)" on 1.75TB device
Actually, this is a major bug for me, as I have to reboot the complete host to replace one OSD. Björn Lässig
07:36 AM rgw Cleanup #57938 (Pending Backport): relying on boost flatmap emplace behavior is risky
see coverity issue: http://folio07.front.sepia.ceph.com/main/ceph-main-98d41855/cov-main-html/3/2253rgw_trim_bilog.cc... Yuval Lifshitz
06:39 AM Orchestrator Bug #57910: ingress: HAProxy fails to start because keepalived IP address not yet available on ne...
Happens also (sometimes?) after re-provisioning an ingress server. After the OS is installed and when cephadm configures the... Voja Molani
04:26 AM RADOS Bug #57937 (Rejected): pg autoscaler of rgw pools doesn't work after creating otp pool
It's about my following post to the ceph-users ML:
https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/threa...
Satoru Takeuchi
02:56 AM CephFS Bug #57920: mds:ESubtreeMap event size is too large
zhikuo du wrote:
> Xiubo Li wrote:
> > zhikuo du wrote:
> > [...]
> > > 4,I think this problem will seriously aff...
zhikuo du

10/26/2022

11:25 PM RADOS Bug #57017 (Pending Backport): mon-stretched_cluster: degraded stretched mode lead to Monitor crash
Neha Ojha
10:01 PM rgw Bug #57936 (Pending Backport): 'radosgw-admin bucket chown' doesn't set bucket instance owner or ...
steps to reproduce:
1. start a vstart cluster and create a bucket as user 'testid'...
Casey Bodley
09:29 PM Orchestrator Bug #57755: task/test_orch_cli: test_cephfs_mirror times out
/a/yuriw-2022-10-19_18:35:19-rados-wip-yuri10-testing-2022-10-19-0810-distro-default-smithi/7074978 Laura Flores
09:18 PM RADOS Bug #52129: LibRadosWatchNotify.AioWatchDelete failed
/a/yuriw-2022-10-19_18:35:19-rados-wip-yuri10-testing-2022-10-19-0810-distro-default-smithi/7074802 Laura Flores
05:08 PM rgw Bug #57562: multisite replication issue on Quincy
Awesome! Thanks for the quick turn around! Will pull and test. Jane Zhu
04:49 PM rgw Bug #57562 (Fix Under Review): multisite replication issue on Quincy
I have a candidate fix at https://github.com/ceph/ceph/pull/48632 Adam Emerson
02:14 PM rgw Bug #57562: multisite replication issue on Quincy
FYI: We pulled in the 2 PRs Casey posted in the tracker https://tracker.ceph.com/issues/57783, and tested again with ... Jane Zhu
12:31 PM rgw Bug #57562: multisite replication issue on Quincy
FWIW, below are some log snippets with enhanced events. To be specific, some existing log events are added addit... Oguzhan Ozmen
03:47 PM Feature #57109 (Fix Under Review): windows: rbd-wnbd SCSI persistent reservations
Lucian Petrut
02:52 PM RADOS Bug #57883 (Resolved): test-erasure-code.sh: TEST_rados_put_get_jerasure fails on "rados_put_get:...
Laura Flores
02:13 PM CephFS Backport #57717 (Resolved): quincy: libcephfs: incorrectly showing the size for snapdirs when sta...
Backport Bot wrote:
> https://github.com/ceph/ceph/pull/48414
Merged.
Venky Shankar
01:45 PM RADOS Bug #50042: rados/test.sh: api_watch_notify failures
... Nitzan Mordechai
04:58 AM RADOS Bug #50042: rados/test.sh: api_watch_notify failures
I checked all the list_watchers failures (checking the size of the watch list). It looks like the watcher timed out and that ... Nitzan Mordechai
12:52 PM Orchestrator Bug #57931 (Fix Under Review): RGW rgw_frontend_type field is not checked correctly during the sp...
Redouane Kachach Elhichou
08:23 AM Orchestrator Bug #57931 (Resolved): RGW rgw_frontend_type field is not checked correctly during the spec parsing
A spec with the following type for example:... Redouane Kachach Elhichou
10:46 AM Bug #57934 (New): Pacific 16.2.10 OSD crashing with tcmalloc
# ceph crash info 2022-10-25T14:49:09.609527Z_b10565bb-036d-408f-a536-442f7b4c8213
{
"archived": "2022-10-26 04...
Michal Nasiadka
10:34 AM Bug #57933 (New): Do package update on base images before building image to reduce Vulnerability
*What should the feature do:*
Update the package definitions on base images before building the image to install...
Pratik Raj
10:09 AM CephFS Bug #57920: mds:ESubtreeMap event size is too large
Xiubo Li wrote:
> zhikuo du wrote:
> [...]
> > 4,I think this problem will seriously affect the IOPS of write and ...
zhikuo du
03:49 AM CephFS Bug #57920: mds:ESubtreeMap event size is too large
zhikuo du wrote:
> Xiubo Li wrote:
> > zhikuo du wrote:
> > [...]
> > > 4,I think this problem will seriously aff...
Xiubo Li
01:42 AM CephFS Bug #57920: mds:ESubtreeMap event size is too large
Xiubo Li wrote:
> zhikuo du wrote:
> [...]
> > 4,I think this problem will seriously affect the IOPS of write and ...
zhikuo du
12:42 AM CephFS Bug #57920: mds:ESubtreeMap event size is too large
zhikuo du wrote:
[...]
> 4,I think this problem will seriously affect the IOPS of write and read.
>
> 5, @Xiubo ...
Xiubo Li
10:05 AM CephFS Bug #57856 (Closed): cephfs-top: Skip refresh when the perf stats query shows no metrics
Closing this, as refreshes are optimised in a better way in https://github.com/ceph/ceph/pull/48090. Jos Collin
09:02 AM mgr Bug #57932: Intermittent ceph-mgr segfault MgrStandby::ms_dispatch2()
Please let me know if I can provide any more detail. If it's helpful I can provide a crash dump Peter Sabaini
09:00 AM mgr Bug #57932 (Need More Info): Intermittent ceph-mgr segfault MgrStandby::ms_dispatch2()
We're seeing intermittent ceph-mgr segfaults in CI... Peter Sabaini
07:09 AM Linux kernel client Bug #57898: ceph client extremely slow kernel version between 5.15 and 6.0
Also, it seems that requests to mds are much slower than writing blocks. When I run the rm command, it sends an avera... Minjong Kim
06:55 AM Linux kernel client Bug #57898: ceph client extremely slow kernel version between 5.15 and 6.0
Xiubo Li wrote:
> Minjong Kim wrote:
> > I used the ceph kernel mount. With a fuse mount it works fine.
> >
> >...
Minjong Kim
06:32 AM Linux kernel client Bug #57898: ceph client extremely slow kernel version between 5.15 and 6.0
https://gist.github.com/caffeinism/dbfd974374d620911a6c0c3dd1daadfb
I am not good at writing files in a shell scri...
Minjong Kim
06:23 AM Linux kernel client Bug #57898: ceph client extremely slow kernel version between 5.15 and 6.0
But I haven't checked the testing branch. (I'll check) Minjong Kim
06:22 AM Linux kernel client Bug #57898: ceph client extremely slow kernel version between 5.15 and 6.0
Xiubo Li wrote:
> Minjong Kim wrote:
> > I used the ceph kernel mount. With a fuse mount it works fine.
> >
> >...
Minjong Kim
05:59 AM Linux kernel client Bug #57898: ceph client extremely slow kernel version between 5.15 and 6.0
Minjong Kim wrote:
> I used the ceph kernel mount. With a fuse mount it works fine.
>
> The test script is nothi...
Xiubo Li
05:49 AM Linux kernel client Bug #57898: ceph client extremely slow kernel version between 5.15 and 6.0
I used the ceph kernel mount. With a fuse mount it works fine.
The test script is nothing special. I just did the...
Minjong Kim
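For reference, a hypothetical reproducer along those lines (the actual script is in the gist linked above; the mount point and file counts here are made up):

<pre>
# Hypothetical reproducer: create many small files on a kernel CephFS
# mount, then time their removal. Not the script from the gist.
import os, time

mnt = "/mnt/cephfs/slowtest"  # placeholder mount point
os.makedirs(mnt, exist_ok=True)

for i in range(10000):
    with open("%s/f%05d" % (mnt, i), "wb") as f:
        f.write(b"x" * 4096)

t0 = time.time()
for i in range(10000):
    os.unlink("%s/f%05d" % (mnt, i))
print("unlink of 10000 files took %.1fs" % (time.time() - t0))
</pre>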
04:55 AM Linux kernel client Bug #57898: ceph client extremely slow kernel version between 5.15 and 6.0
Could you upload your test script?
Do you mean you can also reproduce this using the ceph-fuse mount, right?
Xiubo Li
04:41 AM Linux kernel client Bug #57898: ceph client extremely slow kernel version between 5.15 and 6.0
Hello again
I don't know if anyone is interested, but when tested with an already built kernel (https://kernel.ubunt...
Minjong Kim
06:25 AM CephFS Backport #57929 (In Progress): quincy: qa: test_dump_loads fails with JSONDecodeError
https://github.com/ceph/ceph/pull/54187 Backport Bot
06:18 AM CephFS Bug #57299 (Pending Backport): qa: test_dump_loads fails with JSONDecodeError
Venky Shankar
06:09 AM RADOS Bug #52657: MOSDPGLog::encode_payload(uint64_t): Assertion `HAVE_FEATURE(features, SERVER_NAUTILUS)'
I was able to gather a coredump and set up a binary compatible environment to debug it from this run Laura started in... Brad Hubbard
04:58 AM RADOS Bug #49689: osd/PeeringState.cc: ceph_abort_msg("past_interval start interval mismatch") start
I wrote up a working explanation of PastIntervals in https://github.com/athanatos/ceph/tree/sjust/wip-49689-past-int... Samuel Just
03:48 AM rgw Bug #57853: multisite sync process block after long time running
Quincy, Pacific, Octopus, and Nautilus have the same issue zhipeng li
03:31 AM mgr Backport #57077 (Resolved): quincy: "overlapping roots" error message needs documentation
Konstantin Shalygin
03:27 AM rgw Bug #57928 (Duplicate): Octopus:multisite sync process block after long time running
1. deploy RADOSGW multisite
2. put a lot of objects
3. keep it running for a long time
zhipeng li
03:25 AM rgw Bug #57927: pacific:multisite sync process block after long time running
same as https://tracker.ceph.com/issues/57853 zhipeng li
03:24 AM rgw Bug #57927 (Duplicate): pacific:multisite sync process block after long time running
1. deploy RADOSGW multisite
2. put a lot of objects
3. keep it running for a long time
zhipeng li
12:31 AM Backport #57926 (New): quincy: common: use fmt::print for stderr logging
Backport Bot
12:30 AM Backport #57925 (Rejected): pacific: common: use fmt::print for stderr logging
Backport Bot
12:17 AM Cleanup #53682 (Pending Backport): common: use fmt::print for stderr logging
Patrick Donnelly
12:15 AM Bug #57923 (Fix Under Review): log: writes to stderr (pipe) may not be atomic
Patrick Donnelly
12:07 AM RADOS Bug #57845 (New): MOSDRepOp::encode_payload(uint64_t): Assertion `HAVE_FEATURE(features, SERVER_O...
Notes from rados team meeting:
Seems like the same class of bugs we hit in https://tracker.ceph.com/issues/52657 a...
Neha Ojha

10/25/2022

11:50 PM mgr Backport #57077: quincy: "overlapping roots" error message needs documentation
FYI, the corresponding PR (#47519) was merged and so it seems to be OK to close this ticket. Satoru Takeuchi
11:14 PM RADOS Bug #51729: Upmap verification fails for multi-level crush rule
I put together the following contrived example to
illustrate the problem. Again, this is pacific 16.2.9 on rocky8 li...
Chris Durham
10:20 PM rgw Bug #57562 (In Progress): multisite replication issue on Quincy
The small reproducer turned out not to be one, but I'm fixing that. Adam Emerson
04:51 PM rgw Bug #57562: multisite replication issue on Quincy
Thank you. Adam Emerson
04:34 PM rgw Bug #57562: multisite replication issue on Quincy
Please see the following FIFO log snippets. And please let me know if you need more.
The creation of data_log.34.n...
Jane Zhu
03:53 PM rgw Bug #57562: multisite replication issue on Quincy
Can we get a more complete log snippet? All the FIFO logging with the relevant TIDs would make tracing what's going o... Adam Emerson
03:12 PM rgw Bug #57562: multisite replication issue on Quincy
thanks, that's very interesting Matt Benjamin
02:59 PM rgw Bug #57562: multisite replication issue on Quincy
We pretty much narrowed down what the problem is: a race condition has been identified in FIFO::_prepare_new_head(..)... Jane Zhu
07:00 PM Orchestrator Bug #57800: ceph orch upgrade does not appear to work with FQNDs.
I rebooted last night; all items report a refreshed time of about 13 hours ago, when I rebooted.... Brian Woods
12:14 PM Orchestrator Bug #57800: ceph orch upgrade does not appear to work with FQNDs.
this seems to imply the cephadm service loop just isn't running at all. Does the REFRESHED column in `ceph orch devic... Adam King
05:34 PM Orchestrator Bug #57917: cephadm: duplicate log entry for /var/log/ceph/cephadm.log
As a workaround, I tried a `prerotate` script:... Michael Fritch
04:08 AM Orchestrator Bug #57917: cephadm: duplicate log entry for /var/log/ceph/cephadm.log
This is also problematic for users upgrading from Octopus-based clusters to Pacific. Michael Fritch
04:06 AM Orchestrator Bug #57917 (New): cephadm: duplicate log entry for /var/log/ceph/cephadm.log
Configuration of logrotate for /var/log/ceph/cephadm.log was added (and backported to Pacific) via this PR:
https://...
Michael Fritch
05:19 PM RADOS Bug #50219 (New): qa/standalone/erasure-code/test-erasure-eio.sh fails since pg is not in recover...
The failure actually reproduced here:
/a/lflores-2022-10-17_18:19:55-rados:standalone-main-distro-default-smithi/7...
Laura Flores
05:06 PM RADOS Bug #57883 (Fix Under Review): test-erasure-code.sh: TEST_rados_put_get_jerasure fails on "rados_...
Laura Flores
02:21 PM RADOS Bug #57883 (In Progress): test-erasure-code.sh: TEST_rados_put_get_jerasure fails on "rados_put_g...
Laura Flores
03:39 PM Dashboard Bug #57924: mgr/dashboard: fails with "Module 'dashboard' has failed: key type unsupported" when ...
The 2nd quoted cert was working. Unfortunately I cannot fix my own bug reports. Björn Lässig
03:38 PM Dashboard Bug #57924: mgr/dashboard: fails with "Module 'dashboard' has failed: key type unsupported" when ...
Certificate that did not work:... Björn Lässig
03:33 PM Dashboard Bug #57924 (New): mgr/dashboard: fails with "Module 'dashboard' has failed: key type unsupported"...
h3. Description of problem
After generating a recent certificate by letsencrypt and configuring dashboard to use t...
Björn Lässig
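A hedged diagnostic for reports like this one (assumption, not confirmed by the ticket: that the failure depends on the private key's algorithm, e.g. ECDSA vs. RSA) is to inspect the configured key with the cryptography library:

<pre>
# Diagnostic sketch (assumption: "key type unsupported" depends on the
# key algorithm). Prints whether dashboard.key is an RSA or an EC key.
from cryptography.hazmat.primitives.serialization import load_pem_private_key
from cryptography.hazmat.primitives.asymmetric import ec, rsa

with open("dashboard.key", "rb") as f:   # path is a placeholder
    key = load_pem_private_key(f.read(), password=None)

if isinstance(key, rsa.RSAPrivateKey):
    print("RSA key")
elif isinstance(key, ec.EllipticCurvePrivateKey):
    print("EC key")
else:
    print("other key type:", type(key).__name__)
</pre>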
02:39 PM Documentation #57858 (Resolved): v17.2.4 release does not contain latest cherry-picks
17.2.5 was released with all missing commits Yuri Weinstein
02:19 PM RADOS Bug #57900 (In Progress): mon/crush_ops.sh: mons out of quorum
Laura Flores
02:17 PM RADOS Bug #57900: mon/crush_ops.sh: mons out of quorum
@Radek so the suggestion is to give the mons more time to reboot?
This is the workunit:
https://github.com/ceph/c...
Laura Flores
02:16 PM Bug #57923 (Resolved): log: writes to stderr (pipe) may not be atomic
This can lead to logging from pods like:... Patrick Donnelly
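For background, POSIX guarantees atomicity for pipe writes only up to PIPE_BUF bytes (4096 on Linux), so longer log lines from multiple writers sharing one stderr pipe can interleave. A minimal sketch (illustrative only, not from this ticket) that can show the effect:

<pre>
# Two children each write one long "log line" to a shared pipe; since
# the lines are far larger than PIPE_BUF, the kernel may split the
# writes and the reader can see interleaved output. Non-deterministic.
import os

r, w = os.pipe()
LINE = 2 * 65536  # well above PIPE_BUF and the default pipe capacity

for tag in (b"A", b"B"):
    if os.fork() == 0:
        os.write(w, tag * LINE + b"\n")
        os._exit(0)

os.close(w)  # so the reader sees EOF once both children exit
data = b""
while True:
    chunk = os.read(r, 65536)
    if not chunk:
        break
    data += chunk
mixed = any(b"A" in l and b"B" in l for l in data.split(b"\n") if l)
print("interleaved:", mixed)
</pre>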
02:09 PM Dashboard Bug #57922 (Resolved): mgr/dashboard: error message displaying when editing journal mirror image
h3. Description of problem
An error message is being displayed when editing journal mirror image
!image1.png!
Pedro González Gómez
12:52 PM CephFS Feature #57090: MDSMonitor,mds: add MDSMap flag to prevent clients from connecting
Dhairya, status on this? Patrick Donnelly
12:18 PM Orchestrator Cleanup #57921 (Resolved): orchestrator: orch upgrade status help message is wrong
Currently the `orch upgrade status` help message is just a copy of the `orch upgrade check` help message... Adam King
09:32 AM CephFS Bug #57920 (New): mds:ESubtreeMap event size is too large
In a production environment, we have a problem: the ESubtreeMap event size is too large.
1. The ESubtreeMap event siz...
zhikuo du
07:28 AM ceph-volume Bug #57918: CEPHADM_REFRESH_FAILED: failed to probe daemons or devices
Fixed the issue by removing the unused disk, but an empty disk shouldn't be an issue. Sake Paulusma
06:56 AM ceph-volume Bug #57918 (Resolved): CEPHADM_REFRESH_FAILED: failed to probe daemons or devices
Last Friday I upgraded the Ceph cluster successfully from 17.2.3 to 17.2.5 with "ceph orch upgrade start --image local... Sake Paulusma
07:18 AM rgw Bug #57919 (New): bucket can not be resharded after cancelling prior reshard process
Hi,
we run a multisite setup where only the metadata gets synced, but not the actual data.
I wanted to reshard a b...
Boris B
05:52 AM rgw Bug #56248: crash: rgw::ARN::ARN(rgw_bucket const&)
Fixed in https://tracker.ceph.com/issues/55765 and https://github.com/ceph/ceph/pull/47194/commits is waiting for rel... Tobias Urdin
05:47 AM rgw Bug #56248: crash: rgw::ARN::ARN(rgw_bucket const&)
We had a RGW crash on this as well some hours ago.... Tobias Urdin
05:06 AM Backport #57916 (In Progress): quincy: make-dist creates ceph.spec with incorrect Release tag for...
Tim Serong
03:08 AM Backport #57916 (In Progress): quincy: make-dist creates ceph.spec with incorrect Release tag for...
https://github.com/ceph/ceph/pull/48613 Backport Bot
04:00 AM Orchestrator Cleanup #50168 (Resolved): cephadm: move bin/cephadm from the git tree to download.ceph.com
Michael Fritch
03:02 AM Orchestrator Backport #57638 (Resolved): pacific: applying osd service spec with size filter fails if there's ...
Tim Serong
02:58 AM Bug #57893 (Pending Backport): make-dist creates ceph.spec with incorrect Release tag for SUSE-ba...
Tim Serong

10/24/2022

06:18 PM RADOS Bug #57852: osd: unhealthy osd cannot be marked down in time
Not something we introduced recently, but still worth taking a look at if nothing urgent is on the plate. Radoslaw Zarzynski
06:17 PM RADOS Bug #57852 (New): osd: unhealthy osd cannot be marked down in time
Thanks for the detailed explanation! Radoslaw Zarzynski
06:10 PM RADOS Bug #57845: MOSDRepOp::encode_payload(uint64_t): Assertion `HAVE_FEATURE(features, SERVER_OCTOPUS...
Just before the crash, time-outs were seen:... Radoslaw Zarzynski
06:05 PM RADOS Bug #57915: LibRadosWatchNotify.AioNotify - error callback ceph_assert(ref > 0)
Yes, this is one of the notify bugs that I hit during my tests. Nitzan Mordechai
05:14 PM RADOS Bug #57915: LibRadosWatchNotify.AioNotify - error callback ceph_assert(ref > 0)
Nitzan, I recall you mentioned some watch-related tests on today's stand-up. Is this one of them? Radoslaw Zarzynski
05:57 PM RADOS Bug #57546: rados/thrash-erasure-code: wait_for_recovery timeout due to "active+clean+remapped+la...
As this is about EC: can the acting set's items be duplicated? Radoslaw Zarzynski
05:55 PM RADOS Bug #57546: rados/thrash-erasure-code: wait_for_recovery timeout due to "active+clean+remapped+la...
If https://github.com/ceph/ceph/pull/47901/commits/0d07b406dc2f854363f7ae9b970e980400f4f03e is the actual culprit, th... Radoslaw Zarzynski
05:42 PM RADOS Bug #57883: test-erasure-code.sh: TEST_rados_put_get_jerasure fails on "rados_put_get: grep '\<5...
It looks like we asked to take osd.5 down, got a confirmation the command was handled by the mon, and then @get_osd@ said %5... Radoslaw Zarzynski
05:25 PM RADOS Bug #57900: mon/crush_ops.sh: mons out of quorum
Just a **suggestion** from the bug scrub: this is a mon thrashing test. None of the mon logs seems to have a trace of a crash... Radoslaw Zarzynski
05:18 PM RADOS Bug #55141: thrashers/fastread: assertion failure: rollback_info_trimmed_to == head
Well, just found a new occurrence. Radoslaw Zarzynski
05:11 PM RADOS Bug #55141: thrashers/fastread: assertion failure: rollback_info_trimmed_to == head
Lowering the priority as we haven't seen a recurrence since last time. Radoslaw Zarzynski
05:17 PM RADOS Bug #57913 (Duplicate): Thrashosd: timeout 120 ceph --cluster ceph osd pool rm unique_pool_2 uniq...
In the teuthology log:... Radoslaw Zarzynski
05:10 PM RADOS Bug #57529 (Fix Under Review): mclock backfill is getting higher priority than WPQ
Radoslaw Zarzynski
04:41 PM Bug #57914: centos 8 build failed
All branches failed the same way Yuri Weinstein
03:10 PM Bug #57914: centos 8 build failed
From those logs, I see the builds succeeding. At the end of both, I see:
> Error: authenticating creds for "quay.cep...
Casey Bodley
03:52 PM rgw Bug #19988 (Resolved): RGW: can't stack compression and encryption filters
Casey Bodley
11:37 AM rgw Bug #44660: Multipart re-uploads cause orphan data
Looking at the code. In `MultipartObjectProcessor::process_first_chunk`, if writing the multipart object first chunk ... Mykola Golub
11:24 AM bluestore Bug #57895: OSD crash in Onode::put()
OK, thanks Igor for the confirmation. I'm reviewing your patch; we can discuss over there. dongdong tao
04:06 AM RADOS Bug #56101: Gibba Cluster: 17.2.0 to 17.2.1 RC upgrade OSD crash in function safe_timer
Laura Flores wrote:
> Notes from the rados suite review:
>
> We may need to check if we're shutting down while se...
Brad Hubbard
02:48 AM bluestore Bug #57855: cannot enable level_compaction_dynamic_level_bytes
db_paths is not compatible with level_compaction_dynamic_level_bytes. Beom-Seok Park

10/23/2022

07:05 PM rgw Bug #57899 (Fix Under Review): admin: cannot use tenant with notification topic
Yuval Lifshitz
11:45 AM RADOS Bug #57915 (New): LibRadosWatchNotify.AioNotify - error callback ceph_assert(ref > 0)
/a//nmordech-2022-10-23_05:26:13-rados:verify-wip-nm-51282-distro-default-smithi/7077932... Nitzan Mordechai
05:19 AM RADOS Bug #57699: slow osd boot with valgrind (reached maximum tries (50) after waiting for 300 seconds)
Sridhar, yes, those trackers look the same; valgrind makes the OSD start slower, maybe that's the reason we are seeing... Nitzan Mordechai

10/22/2022

03:16 AM Orchestrator Bug #57800: ceph orch upgrade does not appear to work with FQNDs.
Seems I haven't seen the "host address is empty" error in about 10 days now.... Not sure if that is because of DNS,... Brian Woods

10/21/2022

09:16 PM Bug #57914 (Resolved): centos 8 build failed
I see it on main and pacific
https://jenkins.ceph.com/job/ceph-dev-build/ARCH=x86_64,AVAILABLE_ARCH=x86_64,AVAILAB...
Yuri Weinstein
06:26 PM bluestore Bug #53002: crash BlueStore::Onode::put from BlueStore::TransContext::~TransContext
Hi Sven,
Thanks for reporting telemetry! The issue you reported is tracked in https://tracker.ceph.com/issues/5620...
Yaarit Hatuka
04:41 PM bluestore Bug #53002: crash BlueStore::Onode::put from BlueStore::TransContext::~TransContext
We have almost daily crashes on our octopus cluster, which are also reported via telemetry, which look like this bug,... Anonymous
05:31 PM Backport #57505 (Resolved): quincy: openSUSE Leap 15.x needs to explicitly specify gcc-11
Ilya Dryomov
03:26 PM Backport #57505: quincy: openSUSE Leap 15.x needs to explicitly specify gcc-11
Backport Bot wrote:
> https://github.com/ceph/ceph/pull/48058
merged
Yuri Weinstein
05:28 PM rbd Bug #52915: rbd du versus rbd diff values wildly different when snapshots are present
Alex Yarbrough wrote:
> If I _rbd du_ all of the ~200 images that I have, and sum the result, my total is about 24 T...
Ilya Dryomov
03:25 PM rbd Bug #52915: rbd du versus rbd diff values wildly different when snapshots are present
Ilya, first thank you for the time you put into your messages. I am aware of the issue regarding RBD object size vers... Alex Yarbrough
04:52 PM CephFS Backport #57719: quincy: Test failure: test_subvolume_group_ls_filter_internal_directories (tasks...
Backport Bot wrote:
> https://github.com/ceph/ceph/pull/48327
merged
Yuri Weinstein
04:19 PM RADOS Bug #55809: "Leak_IndirectlyLost" valgrind report on mon.c
/a/yuriw-2022-10-12_16:24:50-rados-wip-yuri8-testing-2022-10-12-0718-quincy-distro-default-smithi/7063948/ Kamoltat (Junior) Sirivadhna
04:16 PM RADOS Bug #57913 (Duplicate): Thrashosd: timeout 120 ceph --cluster ceph osd pool rm unique_pool_2 uniq...
/a/yuriw-2022-10-12_16:24:50-rados-wip-yuri8-testing-2022-10-12-0718-quincy-distro-default-smithi/7063868/
rados/t...
Kamoltat (Junior) Sirivadhna
03:57 PM rbd Backport #57843 (Resolved): quincy: rbd CLI inconsistencies affecting "--namespace" arg
Konstantin Shalygin
03:29 PM rbd Backport #57843: quincy: rbd CLI inconsistencies affecting "--namespace" arg
Backport Bot wrote:
> https://github.com/ceph/ceph/pull/48458
merged
Yuri Weinstein
03:55 PM Orchestrator Bug #56951: rook/smoke: Updating cephclusters/rook-ceph is forbidden
/a/yuriw-2022-10-12_16:24:50-rados-wip-yuri8-testing-2022-10-12-0718-quincy-distro-default-smithi/7063866/ Kamoltat (Junior) Sirivadhna
03:36 PM Orchestrator Bug #52321: qa/tasks/rook times out: 'check osd count' reached maximum tries (90) after waiting f...
/a/yuriw-2022-10-12_16:24:50-rados-wip-yuri8-testing-2022-10-12-0718-quincy-distro-default-smithi/7063706/ Kamoltat (Junior) Sirivadhna
12:41 PM Dashboard Bug #57912 (Fix Under Review): mgr/dashboard: Dashboard creation of NFS exports with RGW backend ...
Volker Theile
12:12 PM Dashboard Bug #57912 (Fix Under Review): mgr/dashboard: Dashboard creation of NFS exports with RGW backend ...
When attempting to create a NFS export with RGW as the backend from Dashboard, this fails as per the description.
Ho...
Francesco Torchia
10:39 AM rgw Bug #57911 (Pending Backport): Segmentation fault when uploading file with bucket policy on Quincy
RGW crashes when a file is uploaded and a bucket policy has been set up.
The crash has been "reproduced for latest...
Jan Graichen
10:28 AM bluestore Bug #57895: OSD crash in Onode::put()
dongdong tao wrote:
> Yaarit Hatuka wrote:
> > Status changed from "New" to "Duplicate" since this issue duplicates...
Igor Fedotov
12:20 AM bluestore Bug #57895: OSD crash in Onode::put()
Yaarit Hatuka wrote:
> Status changed from "New" to "Duplicate" since this issue duplicates https://tracker.ceph.com...
dongdong tao
09:53 AM Orchestrator Bug #57910 (New): ingress: HAProxy fails to start because keepalived IP address not yet available...
After deploying a new cluster _sometimes_ HAProxy fails to start on ingress nodes:... Voja Molani
08:41 AM RADOS Bug #57699: slow osd boot with valgrind (reached maximum tries (50) after waiting for 300 seconds)
@Nitzan Mordechai this is probably similar to,
https://tracker.ceph.com/issues/52948 and https://tracker.ceph.com/is...
Sridhar Seshasayee
07:47 AM RADOS Fix #57040 (Resolved): osd: Update osd's IOPS capacity using async Context completion instead of ...
Sridhar Seshasayee
07:46 AM RADOS Backport #57443 (Resolved): quincy: osd: Update osd's IOPS capacity using async Context completio...
Sridhar Seshasayee
06:03 AM Orchestrator Feature #55490: cephadm: allow passing grafana cert and frontend-api-url in spec
The OP mentioned @set-grafana-frontend-api-url@ but missed mentioning setting @set-grafana-api-url@ from a spec which... Voja Molani

10/20/2022

11:33 PM RADOS Bug #56101: Gibba Cluster: 17.2.0 to 17.2.1 RC upgrade OSD crash in function safe_timer
Notes from the rados suite review:
We may need to check if we're shutting down while sending pg stats; if so, we d...
Laura Flores
10:47 PM bluestore Feature #57785: fragmentation score in metrics
I'm just a user so I can't answer some of the questions. I'll fill in what I know though.
1. Not sure
3. No priva...
Kevin Fox
10:26 PM bluestore Feature #57785: fragmentation score in metrics
Hey Kevin (and Vikhyat),
I have a few questions regarding the fragmentation score:
1. Where are all the places ...
Laura Flores
09:25 PM rbd Bug #52915: rbd du versus rbd diff values wildly different when snapshots are present
Going back to CephRBD_NVMe/vm-101-disk-0 image, your "rbd du" output makes perfect sense to me based on what you said... Ilya Dryomov
09:12 PM rbd Bug #52915: rbd du versus rbd diff values wildly different when snapshots are present
Hi Alex,
"rbd diff CephRBD_NVMe/vm-101-disk-0" reports the allocated areas of the image without taking snapshots i...
Ilya Dryomov
03:40 PM rbd Bug #52915: rbd du versus rbd diff values wildly different when snapshots are present
Greetings all. I have read through the related issues that are resolved. I do not believe this issue is duplicated or... Alex Yarbrough
06:11 PM Orchestrator Feature #57909 (Resolved): cephadm: make logging host refresh data to debug logs configurable
The amount of data we log in the debug logs when refreshing a host is too verbose, even for debug level. It renders t... Adam King
04:14 PM Support #57908 (New): rgw common prefix performance on large bucket
Hi, I'm facing the same issue mentioned here:
https://lists.ceph.io/hyperkitty/list/dev@ceph.io/thread/36P62BOOCJBVVJ...
Jiayu Sun
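For context, the "common prefix" listing in question corresponds to a delimiter-based S3 listing; a minimal boto3 sketch (endpoint, credentials, and bucket name are placeholders) of that code path:

<pre>
# Delimiter listing: each "directory level" comes back in CommonPrefixes,
# the RGW code path whose performance on large buckets is discussed above.
import boto3

s3 = boto3.client("s3", endpoint_url="http://rgw.example.com:8080",
                  aws_access_key_id="...", aws_secret_access_key="...")
paginator = s3.get_paginator("list_objects_v2")
for page in paginator.paginate(Bucket="big-bucket", Delimiter="/"):
    for cp in page.get("CommonPrefixes", []):
        print(cp["Prefix"])
</pre>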
04:09 PM ceph-volume Bug #57907: ceph-volume complains about "Insufficient space (<5GB)" on 1.75TB device
I added a screenshot of the workaround: disabling Hotplug in the BIOS. Björn Lässig
04:01 PM ceph-volume Bug #57907: ceph-volume complains about "Insufficient space (<5GB)" on 1.75TB device
The problem is that in @util/device.py@ line 582,
the call to @int(self.sys_api.get('size', 0))@ is always 0 if s...
Björn Lässig
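A minimal sketch of the kind of guard this points at (hypothetical, not the actual ceph-volume code): fall back to sysfs when sys_api reports size 0:

<pre>
# Hypothetical sketch only, not the real ceph-volume code. If the
# inventory data (sys_api) reports size 0 for a device, read the size
# from sysfs instead of rejecting the device as "Insufficient space".
import os

def device_size_bytes(sys_api, dev):
    size = int(sys_api.get('size', 0))
    if size > 0:
        return size
    # /sys/block/<dev>/size is in 512-byte sectors, regardless of the
    # device's logical block size.
    path = "/sys/block/%s/size" % os.path.basename(dev)
    try:
        with open(path) as f:
            return int(f.read().strip()) * 512
    except (OSError, ValueError):
        return 0

# e.g. device_size_bytes({'size': 0}, '/dev/sda') falls back to sysfs
</pre>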
03:15 PM ceph-volume Bug #57907 (Duplicate): ceph-volume complains about "Insufficient space (<5GB)" on 1.75TB device
On a one-week-old working 17.2.5 cluster, I try to add another host with 2 SSDs and 4 HDDs.
None of them is shown as...
Björn Lässig
03:07 PM RADOS Bug #57152 (Resolved): segfault in librados via libcephsqlite
Matan Breizman
03:06 PM RADOS Backport #57373 (Resolved): pacific: segfault in librados via libcephsqlite
Matan Breizman
02:56 PM RADOS Backport #57373: pacific: segfault in librados via libcephsqlite
Backport Bot wrote:
> https://github.com/ceph/ceph/pull/48187
merged
Yuri Weinstein
03:01 PM Orchestrator Backport #57638: pacific: applying osd service spec with size filter fails if there's tiny (KB-si...
Backport Bot wrote:
> https://github.com/ceph/ceph/pull/48243
merged
Yuri Weinstein
02:58 PM Orchestrator Backport #57639: pacific: cephadm: `ceph orch ps` doesn't list container versions in some cases
Backport Bot wrote:
> https://github.com/ceph/ceph/pull/48210
merged
Yuri Weinstein
02:55 PM ceph-volume Backport #57566: pacific: inventory a device get_partitions_facts called many times
Guillaume Abrioux wrote:
> https://github.com/ceph/ceph/pull/48126
merged
Yuri Weinstein
02:53 PM ceph-volume Backport #57564: pacific: functional test lvm-centos8-filestore-create is broken
Backport Bot wrote:
> https://github.com/ceph/ceph/pull/48123
merged
Yuri Weinstein
02:45 PM Bug #57906 (New): ceph -s show too many executing tasks
I got a lot of executing tasks in ceph -s, but I'm sure there is nothing running. How can I clear these messages? Als... Jiayu Sun
02:24 PM rgw Bug #57770 (Triaged): RGW (pacific) misplaces index entries after dynamically resharding bucket
Casey Bodley
02:24 PM rgw Bug #57770 (New): RGW (pacific) misplaces index entries after dynamically resharding bucket
Casey Bodley
02:21 PM rgw Bug #57783: multisite: data sync reports shards behind after source zone fully trims datalog
related work in https://github.com/ceph/ceph/pull/47682 and https://github.com/ceph/ceph/pull/48397
Casey Bodley
02:20 PM rgw Bug #57804: Enabling sync on bucket not working
I can only recommend running the command until it succeeds. Casey Bodley
02:18 PM rgw Bug #57853 (Need More Info): multisite sync process block after long time running
Casey Bodley
02:16 PM rgw Bug #57901 (Fix Under Review): s3:ListBuckets response limited to 1000 buckets (by default) since...
Casey Bodley
02:11 PM rgw Bug #57231 (Resolved): Valgrind: jump on unitialized in s3select
Casey Bodley
01:51 PM bluestore Bug #57895 (Duplicate): OSD crash in Onode::put()
Status changed from "New" to "Duplicate" since this issue duplicates https://tracker.ceph.com/issues/56382. Yaarit Hatuka
10:10 AM bluestore Bug #57895: OSD crash in Onode::put()
Please help to review this one, https://github.com/ceph/ceph/pull/48566
Here is the related log: https://pastebin....
dongdong tao
01:30 PM rgw Bug #57905 (Pending Backport): multisite: terminate called after throwing an instance of 'ceph::b...
example from rgw/multisite suite: http://qa-proxy.ceph.com/teuthology/cbodley-2022-10-19_23:28:37-rgw-wip-cbodley-tes... Casey Bodley
10:54 AM bluestore Bug #56851: crash: int BlueStore::read_allocation_from_onodes(SimpleBitmap*, BlueStore::read_allo...
@Sudhin - curious if you can reproduce the issue? If so it would be great to get OSD log with debug-bluestore set to ... Igor Fedotov
10:52 AM bluestore Bug #52464: FAILED ceph_assert(current_shard->second->valid())
IMO this is rather related to DB sharding stuff introduced by https://github.com/ceph/ceph/pull/34006
Hence reassign...
Igor Fedotov
10:46 AM bluestore Bug #52464: FAILED ceph_assert(current_shard->second->valid())
Neha Ojha wrote:
> Gabi, I am assigning it to you for now, since this looks related to NCB.
No, apparently this i...
Igor Fedotov
09:49 AM bluestore Bug #57857 (Fix Under Review): KernelDevice::read doesn't translate error codes correctly
Igor Fedotov
09:40 AM bluestore Bug #56382 (Fix Under Review): ONode ref counting is broken
Igor Fedotov
09:10 AM bluestore Bug #56382 (Pending Backport): ONode ref counting is broken
Igor Fedotov
06:33 AM CephFS Bug #54557 (Fix Under Review): scrub repair does not clear earlier damage health status
Kotresh Hiremath Ravishankar
06:24 AM Dashboard Bug #57284 (Resolved): mgr/dashboard: 500 internal server error seen on ingress service creation ...
Nizamudeen A
06:24 AM Dashboard Backport #57485 (Resolved): pacific: mgr/dashboard: 500 internal server error seen on ingress ser...
Nizamudeen A
05:57 AM rgw Bug #57562: multisite replication issue on Quincy
We have an example scenario here where one of the objects in a bucket failed to be synced to the secondary.
* Mdlog...
Jane Zhu
05:28 AM CephFS Backport #57716 (Resolved): pacific: libcephfs: incorrectly showing the size for snapdirs when st...
Backport Bot wrote:
> https://github.com/ceph/ceph/pull/48413
Merged.
Venky Shankar
04:54 AM CephFS Backport #57874 (In Progress): quincy: Permissions of the .snap directory do not inherit ACLs
Venky Shankar
04:17 AM CephFS Backport #57723 (Resolved): pacific: qa: test_subvolume_snapshot_info_if_orphan_clone fails
Backport Bot wrote:
> https://github.com/ceph/ceph/pull/48417
Merged.
Venky Shankar

10/19/2022

11:31 PM rbd Bug #57902 (Resolved): [rbd-nbd] add --snap-id option to "rbd device map" to allow mapping arbitr...
Like any snapshot in a non-user snapshot namespace, mirror snapshots are inaccessible to most rbd CLI commands. As suc... Ilya Dryomov
11:16 PM rbd Bug #57066 (Fix Under Review): rbd snap list not change the last read when more than 64 group snaps
Ilya Dryomov
09:28 PM rgw Bug #57901 (Resolved): s3:ListBuckets response limited to 1000 buckets (by default) since Octopus
Since Octopus, s3:ListBuckets is limited to rgw_list_buckets_max_chunk buckets in its response due to loss of truncat... Joshua Baergen
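Until a fix lands, a hedged client-side check for the silent cap (boto3; endpoint and credentials are placeholders):

<pre>
# s3:ListBuckets is not paginated, so a response that is exactly
# rgw_list_buckets_max_chunk entries long (1000 by default) may have
# been silently truncated by the bug described above.
import boto3

s3 = boto3.client("s3", endpoint_url="http://rgw.example.com:8080",
                  aws_access_key_id="...", aws_secret_access_key="...")
buckets = s3.list_buckets()["Buckets"]
print("buckets returned:", len(buckets))
if len(buckets) == 1000:
    print("warning: the listing may be capped; see this tracker issue")
</pre>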
09:21 PM RADOS Backport #52747 (In Progress): pacific: MON_DOWN during mon_join process
Laura Flores
09:09 PM RADOS Backport #52746 (Rejected): octopus: MON_DOWN during mon_join process
Octopus is EOL. Laura Flores
08:59 PM RADOS Bug #43584: MON_DOWN during mon_join process
/a/yuriw-2022-10-05_20:44:57-rados-wip-yuri4-testing-2022-10-05-0917-pacific-distro-default-smithi/7055594 Laura Flores
08:46 PM RADOS Bug #57900 (In Progress): mon/crush_ops.sh: mons out of quorum
/a/teuthology-2022-10-09_07:01:03-rados-quincy-distro-default-smithi/7059463... Laura Flores
05:56 PM Orchestrator Bug #57341: cephadm: failures from tests comparing output strings are difficult to debug
See attached screenshot for a better colorized example. John Mulligan
05:53 PM Orchestrator Bug #57341: cephadm: failures from tests comparing output strings are difficult to debug
I did a few minutes of research and found two packages that may help:
pytest-mock (https://pytest-mock.readthedocs.i...
John Mulligan
03:38 PM Linux kernel client Bug #57898: ceph client extremely slow kernel version between 5.15 and 6.0
Even with the ceph-fuse method from the issue body, it gets slow again over time. Minjong Kim
12:47 PM Linux kernel client Bug #57898 (In Progress): ceph client extremely slow kernel version between 5.15 and 6.0
Hello, I am very new to Ceph. Thank you for taking that into consideration and reading.
I recently changed the ker...
Minjong Kim
03:20 PM RADOS Bug #57698 (Pending Backport): osd/scrub: "scrub a chunk" requests are sent to the wrong set of r...
Ronen Friedman
03:05 PM rgw Bug #16767 (In Progress): RadosGW Multipart Cleanup Failure
Matt Benjamin
02:55 PM rgw Bug #16767: RadosGW Multipart Cleanup Failure
Vicki Good wrote:
> I've encountered this bug in Ceph 14 and 15 and it's a pretty big problem for us for the same re...
Rok Jaklic
02:16 PM CephFS Backport #57875 (In Progress): pacific: Permissions of the .snap directory do not inherit ACLs
Venky Shankar
01:45 PM bluestore Bug #57855: cannot enable level_compaction_dynamic_level_bytes
I found that the level_compaction_dynamic_level_bytes option does not apply if opt.db_paths exists when opening rocks... Beom-Seok Park
01:26 PM bluestore Bug #55324: rocksdb omap iterators become extremely slow in the presence of large delete range to...
Benoît Knecht wrote:
> > I see this was backported in: https://github.com/ceph/ceph/pull/45963 but was later reverte...
Igor Fedotov
12:09 PM bluestore Bug #55324: rocksdb omap iterators become extremely slow in the presence of large delete range to...
Sven Kieske wrote:
> I assume this was not backported to the last octopus release?
Yes, Octopus is EOL
Konstantin Shalygin
12:04 PM bluestore Bug #55324: rocksdb omap iterators become extremely slow in the presence of large delete range to...
> I see this was backported in: https://github.com/ceph/ceph/pull/45963 but was later reverted in https://github.com/... Benoît Knecht
11:21 AM bluestore Bug #55324: rocksdb omap iterators become extremely slow in the presence of large delete range to...
Sven Kieske wrote:
> I don't see the PR showing up in any release notes. I assume this was not backported to the las...
Anonymous
11:16 AM bluestore Bug #55324: rocksdb omap iterators become extremely slow in the presence of large delete range to...
I don't see the PR showing up in any release notes. I assume this was not backported to the last octopus release? In ... Anonymous
09:06 AM bluestore Bug #55324 (Resolved): rocksdb omap iterators become extremely slow in the presence of large dele...
Igor Fedotov
01:20 PM rgw-testing Bug #54104: test_rgw_datacache.py: s3cmd fails with '403 (SignatureDoesNotMatch)' in ubuntu
ping @Mark, this remains a blocker for enabling ubuntu in the rgw/verify suite. that subsuite contains most of our fu... Casey Bodley
01:11 PM rgw Bug #57899 (Pending Backport): admin: cannot use tenant with notification topic
The issue was a regression introduced in 200f71a90c9e77c91452cec128c2c8be0d3d6f1f.
topic notification commands should be...
Yuval Lifshitz
01:03 PM mgr Bug #55046 (Resolved): mgr: perf counters node exporter
Konstantin Shalygin
12:59 PM mgr Backport #57141 (Resolved): quincy: mgr: perf counters node exporter
Avan Thakkar
12:27 PM Orchestrator Bug #57897 (New): ceph mgr restart causes restart of all iscsi daemons in a loop
We have observed that since v17.2.4, a restart of the active ceph mgr appears to cause all iSCSI daemons to restart a... Dan Poltawski
11:49 AM Dashboard Feature #57896 (New): mgr/dashboard: create per component high level dashboard view
h3. Description of problem
A great improvement to the dashboard would be to have a higher-level view of each compo...
Pere Díaz Bou
11:49 AM bluestore Bug #57895: OSD crash in Onode::put()
This is observed on 15.2.16, but I believe the code defect that causes this kind of race condition is still present on... dongdong tao
11:42 AM bluestore Bug #57895 (Duplicate): OSD crash in Onode::put()

This issue happens when an Onode is being trimmed right away after it's unpinned. This is possible when the LRU lis...
dongdong tao
11:01 AM Bug #57868: iSCSI: rbd-target-api reports python version and identified 'unsupported version' tri...
This likely goes for all ceph-container containers... Guillaume, could you please take a look? Ilya Dryomov
10:30 AM Orchestrator Feature #57894 (Fix Under Review): Move prometheus spec check to the service_spec module
Redouane Kachach Elhichou
10:18 AM Orchestrator Feature #57894 (Pending Backport): Move prometheus spec check to the service_spec module
Redouane Kachach Elhichou
10:29 AM RADOS Bug #57699: slow osd boot with valgrind (reached maximum tries (50) after waiting for 300 seconds)
The issue is that we are hitting a deadlock under a specific condition. When we are trying to update the mClockScheduler config c... Nitzan Mordechai
09:13 AM CephFS Bug #57882: Kernel Oops, kernel NULL pointer dereference
Xiubo Li wrote:
> It's a known bug and I will check this today or this week.
Oh my! I did search for anything pr... Julien Banchet
Julien Banchet
08:46 AM bluestore Bug #55328 (Closed): OSD crashed due to checksum error
Igor Fedotov
08:45 AM Bug #57893 (Fix Under Review): make-dist creates ceph.spec with incorrect Release tag for SUSE-ba...
Tim Serong
08:04 AM Bug #57893 (Pending Backport): make-dist creates ceph.spec with incorrect Release tag for SUSE-ba...
@ceph.spec.in@ says:... Tim Serong
07:43 AM Dashboard Bug #57805 (Pending Backport): mgr/dashboard: Unable to change subuser permission
Nizamudeen A
07:42 AM Dashboard Bug #57805 (Resolved): mgr/dashboard: Unable to change subuser permission
Nizamudeen A
07:42 AM Dashboard Backport #57841 (Resolved): quincy: mgr/dashboard: Unable to change subuser permission
Nizamudeen A
07:33 AM Dashboard Feature #57826 (Resolved): mgr/dashboard: add server side encryption to rgw/s3
Nizamudeen A
07:33 AM Dashboard Backport #57835 (Resolved): quincy: mgr/dashboard: add server side encryption to rgw/s3
Nizamudeen A
05:57 AM rbd Bug #57872 (Fix Under Review): [pwl] inconsistent "rbd status" output (clean = true but dirty_byt...
CONGMIN YIN
05:31 AM RADOS Bug #57546: rados/thrash-erasure-code: wait_for_recovery timeout due to "active+clean+remapped+la...
I was able to reproduce this using the test Laura mentioned above - http://pulpito.front.sepia.ceph.com/amathuri-2022... Aishwarya Mathuria
05:12 AM Dashboard Bug #39726 (Resolved): mgr/dashboard: "Striping" feature checkbox missing in RBD image dialog
Nizamudeen A
05:12 AM Dashboard Backport #56566 (Resolved): pacific: mgr/dashboard: "Striping" feature checkbox missing in RBD im...
Nizamudeen A
05:06 AM crimson Bug #57629: crimson: segfault during mkfs
Using GCC 12.2.0 on Ubuntu 22.04, I am facing the same problem. Jianxin Li
03:26 AM crimson Bug #57549: Crimson: Alienstore not work after ceph enable c++20
This problem disappeared after updating the GCC compiler to version 12.2.0. And I hit the segmentation fault at https:/... Jianxin Li

10/18/2022

07:16 PM Dashboard Bug #48258: mgr/dashboard: Switch from tslint to eslint
Thanks Nizam, I will get working on it. Ngwa Sedrick Meh
06:25 PM Documentation #57858: v17.2.4 release does not contain latest cherry-picks
Bottom line: The quincy-release branch (and future release branches) should be up-to-date on the Ceph repository for ... Laura Flores
06:04 PM Orchestrator Bug #57891 (Resolved): [Gibba Cluster] HEALTH_ERR: Upgrade: failed due to an unexpected exception
- Upgrade paused due to one host not being reachable in the cluster.
- Resumed the upgrade with the resume command
...
Vikhyat Umrao
05:29 PM Bug #57890: cmd_getval() throws but many callers don't catch the exception
For reference, here are crashes with `cmd_getval` in their backtrace:
http://telemetry.front.sepia.ceph.com:4000/d/N...
Yaarit Hatuka
05:02 PM Bug #57890 (New): cmd_getval() throws but many callers don't catch the exception
In https://github.com/ceph/ceph/pull/23557 we switched @cmd_getval()@ to throw on error. This family of functions hav... Radoslaw Zarzynski
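For illustration, a Python analogue of the hazard (illustrative only; the real @cmd_getval()@ is C++ in Ceph's command handling): once the getter throws, every caller without a try/catch is a potential crash, and a small non-throwing wrapper restores the old contract:

<pre>
# Python analogue of the hazard described above (illustrative only).
class CmdError(Exception):
    pass

def cmd_getval(cmdmap, key):
    """Throwing flavour: raises if the parameter is missing."""
    if key not in cmdmap:
        raise CmdError("missing required parameter: %s" % key)
    return cmdmap[key]

def cmd_getval_or(cmdmap, key, default=None):
    """Defensive wrapper: keeps the old 'return a default' contract."""
    try:
        return cmd_getval(cmdmap, key)
    except CmdError:
        return default

# A caller of cmd_getval() that forgets try/except crashes on bad input;
# cmd_getval_or() degrades gracefully instead.
pool = cmd_getval_or({"prefix": "osd pool rm"}, "pool", default="")
print(repr(pool))
</pre>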
04:31 PM RADOS Bug #51729: Upmap verification fails for multi-level crush rule
Chris, can you please provide your osdmap binary? Neha Ojha
04:13 PM rgw Backport #57889 (Rejected): pacific: amqp: rgw crash when ca location is used for amqp connections
Backport Bot
04:12 PM rgw Backport #57888 (In Progress): quincy: amqp: rgw crash when ca location is used for amqp connections
https://github.com/ceph/ceph/pull/54170 Backport Bot
04:08 PM rgw Bug #57850 (Pending Backport): amqp: rgw crash when ca location is used for amqp connections
Yuval Lifshitz
03:49 PM Orchestrator Backport #57787 (In Progress): quincy: mgr/nfs: Add a sectype field to nfs exports created by nfs...
Adam King
03:39 PM rgw Bug #57881 (Fix Under Review): LDAP invalid password resource leak fix
Casey Bodley
09:56 AM rgw Bug #57881: LDAP invalid password resource leak fix
I created a pull request for a possible fix:
https://github.com/ceph/ceph/pull/48509
Johannes Liebl
01:02 PM rgw Bug #57877 (Fix Under Review): rgw: some operations may not have a valid bucket object
Casey Bodley
09:53 AM mgr Backport #57887 (In Progress): pacific: mgr/prometheus: avoid duplicates and deleted entries for ...
Konstantin Shalygin
09:04 AM mgr Backport #57887 (Resolved): pacific: mgr/prometheus: avoid duplicates and deleted entries for rbd...
https://github.com/ceph/ceph/pull/48524 Backport Bot
09:49 AM mgr Backport #57886 (In Progress): quincy: mgr/prometheus: avoid duplicates and deleted entries for r...
Konstantin Shalygin
09:04 AM mgr Backport #57886 (Resolved): quincy: mgr/prometheus: avoid duplicates and deleted entries for rbd_...
https://github.com/ceph/ceph/pull/48523 Backport Bot
09:35 AM Linux kernel client Bug #47450 (Resolved): stop parsing the error string in the session reject message
Fixed in:... Xiubo Li
09:33 AM Linux kernel client Bug #46904: kclient: cluster [WRN] client.4478 isn't responding to mclientcaps(revoke)
Fixed it in kernel and the patchwork link: https://patchwork.kernel.org/project/ceph-devel/list/?series=686074 Xiubo Li
09:27 AM Backport #57885 (In Progress): quincy: disable system_pmdk on s390x for SUSE distros
Tim Serong
08:49 AM Backport #57885 (Resolved): quincy: disable system_pmdk on s390x for SUSE distros
https://github.com/ceph/ceph/pull/48522 Backport Bot
09:03 AM RADOS Bug #57845: MOSDRepOp::encode_payload(uint64_t): Assertion `HAVE_FEATURE(features, SERVER_OCTOPUS...
Hi Neha,
the logs from the crash instance that I reported initially are already rotated out on the particular node...
Andreas Teuchert
09:00 AM mgr Bug #57797 (Pending Backport): mgr/prometheus: avoid duplicates and deleted entries for rbd_stats...
Avan Thakkar
08:41 AM Bug #57860 (Pending Backport): disable system_pmdk on s390x for SUSE distros
Ilya Dryomov
08:19 AM Orchestrator Bug #57096: osd not restarting after upgrading to quincy due to podman args --cgroups=split
I manually created the unit.meta, and it seems to work. Thanks again. Ween Jiann Lee
06:28 AM Orchestrator Bug #57096: osd not restarting after upgrading to quincy due to podman args --cgroups=split
The unit.meta file is not yet present in Octopus. I'll try to figure something out or wait for the PR release.
Tha...
Ween Jiann Lee
02:48 AM RADOS Bug #57852: osd: unhealthy osd cannot be marked down in time
Radoslaw Zarzynski wrote:
> Could you please clarify a bit? Do you mean there some extra, unnecessary (from the POV ...
wencong wan
02:19 AM CephFS Backport #57880 (In Progress): pacific: NFS client unable to see newly created files when listing...
Ramana Raja
02:14 AM CephFS Backport #57879 (In Progress): quincy: NFS client unable to see newly created files when listing ...
Ramana Raja
12:52 AM CephFS Bug #57882 (Duplicate): Kernel Oops, kernel NULL pointer dereference
It's a known bug and I will check this today or this week. Xiubo Li

10/17/2022

07:29 PM Orchestrator Bug #57800: ceph orch upgrade does not appear to work with FQNDs.
alright, looking back at the original traceback... Adam King
06:55 PM Orchestrator Bug #57884 (Resolved): cephadm: attempting a daemon redeploy of the active mgr with a specified i...
If I run something like... Adam King
06:27 PM RADOS Bug #57796: after rebalance of pool via pgupmap balancer, continuous issues in monitor log
Link to the discussion on ceph-users: https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/AZHAIGY3BIM4SGB... Radoslaw Zarzynski
06:20 PM RADOS Bug #57883: test-erasure-code.sh: TEST_rados_put_get_jerasure fails on "rados_put_get: grep '\<5...
Let's first see if it's easily reproducible:
http://pulpito.front.sepia.ceph.com/lflores-2022-10-17_18:19:55-rados:s...
Laura Flores
06:03 PM RADOS Bug #57883: test-erasure-code.sh: TEST_rados_put_get_jerasure fails on "rados_put_get: grep '\<5...
The failed function:
qa/standalone/erasure-code/test-erasure-code.sh...
Laura Flores
05:52 PM RADOS Bug #57883 (Resolved): test-erasure-code.sh: TEST_rados_put_get_jerasure fails on "rados_put_get:...
/a/yuriw-2022-10-13_17:24:48-rados-main-distro-default-smithi/7065580... Laura Flores
06:16 PM RADOS Bug #57845 (Need More Info): MOSDRepOp::encode_payload(uint64_t): Assertion `HAVE_FEATURE(feature...
These reports in telemetry look similar: http://telemetry.front.sepia.ceph.com:4000/d/Nvj6XTaMk/spec-search?orgId=1&v... Neha Ojha
06:08 PM RADOS Bug #57852 (Need More Info): osd: unhealthy osd cannot be marked down in time
Could you please clarify a bit? Do you mean there are some extra, unnecessary (from the POV of judging whether an OSD is ... Radoslaw Zarzynski
06:01 PM mgr Bug #57460: Json formatted ceph pg dump hangs on large clusters
Thanks, Radoslaw! I'll look into modifying the patch as you suggested, targeting Reef. Ponnuvel P
05:48 PM RADOS Bug #57782: [mon] high cpu usage by fn_monstore thread
NOT A FIX (extra debugs): https://github.com/ceph/ceph/pull/48513 Radoslaw Zarzynski
05:45 PM RADOS Bug #57698 (Fix Under Review): osd/scrub: "scrub a chunk" requests are sent to the wrong set of r...
Neha Ojha
05:43 PM RADOS Bug #51729: Upmap verification fails for multi-level crush rule
A note from bug scrub: this is going to be assigned tomorrow. Radoslaw Zarzynski
02:49 PM Bug #57613: Kernel Oops, kernel NULL pointer dereference
Moved (copied) to CephFS; it might get a better echo there :) This one can be closed. Julien Banchet
02:47 PM CephFS Bug #57882 (Duplicate): Kernel Oops, kernel NULL pointer dereference
(repost from Ceph (#57613), I couldn't find a way to move the bug entry from one project to another)
Hello everyon...
Julien Banchet
02:19 PM rbd Backport #57779 (Resolved): quincy: [test] fio 3.16 doesn't build on recent kernels due to remova...
Ilya Dryomov
02:10 PM rbd Backport #57779: quincy: [test] fio 3.16 doesn't build on recent kernels due to removal of linux/...
Backport Bot wrote:
> https://github.com/ceph/ceph/pull/48386
merged
Yuri Weinstein
01:45 PM Dashboard Backport #57828 (Resolved): quincy: cephadm/test_dashboard_e2e.sh: Expected to find content: '/^f...
Nizamudeen A
01:25 PM rbd Tasks #54312: combine the journal and snapshot test scripts
Please set the state to Fix Under Review once the lab stuff is sorted out and you have a link to a test run. Ilya Dryomov
01:22 PM rbd Bug #57066 (In Progress): rbd snap list not change the last read when more than 64 group snaps
Ilya Dryomov
12:30 PM rgw Bug #57881 (Pending Backport): LDAP invalid password resource leak fix
I have noticed that when a user tries to log in using LDAP with a wrong password, two new LDAP sessions will b... Johannes Liebl
12:04 PM CephFS Backport #57880 (Resolved): pacific: NFS client unable to see newly created files when listing di...
https://github.com/ceph/ceph/pull/48521 Backport Bot
12:04 PM CephFS Backport #57879 (Resolved): quincy: NFS client unable to see newly created files when listing dir...
https://github.com/ceph/ceph/pull/48520 Backport Bot
11:57 AM CephFS Bug #57210 (Pending Backport): NFS client unable to see newly created files when listing director...
Venky Shankar
11:53 AM CephFS Backport #57261 (Resolved): pacific: standby-replay mds is removed from MDSMap unexpectedly
Venky Shankar
10:54 AM Orchestrator Feature #57878 (Resolved): Add typing checks for rgw module
Redouane Kachach Elhichou
10:23 AM bluestore Bug #57855: cannot enable level_compaction_dynamic_level_bytes
I did some more digging on this and found that this PR was the cause.
https://github.com/ceph/ceph/pull/43100
Beom-Seok Park
09:32 AM Orchestrator Bug #57876 (Fix Under Review): prometheus ERROR failed to collect metrics
Redouane Kachach Elhichou
09:13 AM Orchestrator Bug #57876 (Resolved): prometheus ERROR failed to collect metrics
... Redouane Kachach Elhichou
09:19 AM rgw Bug #57877 (Resolved): rgw: some operations may not have a valid bucket object
Some codepaths may not always have a valid bucket, so add checks to detect this. Abhishek Lekshmanan
08:57 AM Linux kernel client Bug #46904 (Fix Under Review): kclient: cluster [WRN] client.4478 isn't responding to mclientcaps...
Xiubo Li
04:52 AM Linux kernel client Bug #46904: kclient: cluster [WRN] client.4478 isn't responding to mclientcaps(revoke)
The MDS was waiting for _*Fw*_ caps:... Xiubo Li
03:43 AM Linux kernel client Bug #56524 (Resolved): xfstest-dev: generic/467 failed with "open_by_handle(/mnt/kcephfs.A/467-di...
Xiubo Li
03:42 AM Linux kernel client Bug #57321 (Resolved): xfstests: ceph/004 setfattr: /mnt/kcephfs.A/test-004/dest: Invalid argument
Xiubo Li
03:41 AM Linux kernel client Bug #57342 (Resolved): kclient: incorrectly showing the size for snapdirs when stating them
Xiubo Li

10/16/2022

02:50 PM CephFS Backport #57875 (Resolved): pacific: Permissions of the .snap directory do not inherit ACLs
https://github.com/ceph/ceph/pull/48553 Backport Bot
02:50 PM CephFS Backport #57874 (Resolved): quincy: Permissions of the .snap directory do not inherit ACLs
https://github.com/ceph/ceph/pull/48563 Backport Bot
02:49 PM CephFS Bug #57084 (Pending Backport): Permissions of the .snap directory do not inherit ACLs
Venky Shankar
02:46 PM CephFS Bug #57084 (Resolved): Permissions of the .snap directory do not inherit ACLs
Venky Shankar

10/15/2022

08:36 PM crimson Bug #57873 (New): crimson: override overrides.ceph.flavor in crimson_qa_overrides.yaml as well
overrides.ceph.flavor = default gets set by teuthology/suite/placeholder.py Samuel Just
09:19 AM rbd Bug #57872 (Resolved): [pwl] inconsistent "rbd status" output (clean = true but dirty_bytes = 61440)
This popped up in a quincy integration branch run, but the code in main is exactly the same:... Ilya Dryomov

10/14/2022

09:17 PM rgw Bug #52027: XML responses return different order of XML elements
Hi
I think this is not fully addressed.
I've added a comment to pull request https://github.com/ceph/ceph/pull/42...
Daniel Iwan
09:13 PM RADOS Bug #51729: Upmap verification fails for multi-level crush rule
Andras,
Thanks for the extra info. This needs to be addressed. Anyone?
Chris Durham
08:48 PM RADOS Bug #51729: Upmap verification fails for multi-level crush rule
Just to clarify - the error "verify_upmap number of buckets X exceeds desired Y" comes from the C++ code in ceph-mon ... Andras Pataki
06:47 PM RADOS Bug #51729: Upmap verification fails for multi-level crush rule
I am now seeing this issue on pacific, 16.2.10 on rocky8 linux.
If I have a >2 level rule on an ec pool (6+2), suc...
Chris Durham
06:54 PM rgw Backport #57430: quincy: key is used after move in RGWGetObj_ObjStore_S3::override_range_hdr
Backport Bot wrote:
> https://github.com/ceph/ceph/pull/48228
merged
Yuri Weinstein
06:50 PM Orchestrator Bug #57870 (Resolved): cephadm: --apply-spec is trying to do too much and failing as a result
--apply-spec is intended to do 2 things:
1) distribute ssh keys to hosts with host specs in the applied spec
2) ...
Adam King
04:15 PM RADOS Bug #57698: osd/scrub: "scrub a chunk" requests are sent to the wrong set of replicas
Following some discussions: here are excerpts from a run demonstrating this issue.
Test run rfriedma-2022-09-28_15:5...
Ronen Friedman
04:04 PM Orchestrator Bug #57800: ceph orch upgrade does not appear to work with FQNDs.
Oh, by all combinations, I mean I created DNS entries for all hosts, not just ceph02. Brian Woods
04:03 PM Orchestrator Bug #57800: ceph orch upgrade does not appear to work with FQNDs.
I add DNS entries for all combinations. So both ceph02.oldname.local and ceph02.domain.local are now valid names but... Brian Woods
02:05 PM rbd Tasks #54312 (In Progress): combine the journal and snapshot test scripts
Christopher Hoffman
01:42 PM rgw Bug #44660: Multipart re-uploads cause orphan data
Writing on behalf of Ulrich Klein <Ulrich.Klein@ulrichklein.de>; he wanted to add some info to this tracker. Below is... Dhairya Parmar
10:45 AM Orchestrator Bug #57781 (Rejected): Fix prometheus dependencies calculation
closing as the current behavior is correct. We just need to add some comments to clarify the logic. Redouane Kachach Elhichou
10:21 AM Orchestrator Bug #57366 (Pending Backport): prometheus is not re-deployed when service-discovery port changes
Redouane Kachach Elhichou
10:20 AM Orchestrator Bug #57816 (Fix Under Review): Add support to configure protocol (http or https) for Grafana url ...
Redouane Kachach Elhichou
09:24 AM Dashboard Bug #48258: mgr/dashboard: Switch from tslint to eslint
Great, thanks Sedrick. You can assign it to yourself. There are two open PRs currently. You can go over the discussions ... Nizamudeen A
09:19 AM Dashboard Bug #48258: mgr/dashboard: Switch from tslint to eslint
Hi, I would like to work on this one. Ngwa Sedrick Meh
08:15 AM rgw Bug #57804: Enabling sync on bucket not working
Hello Casey,
The init command finished after running for 60 minutes.
Unfortunately the two errors are returned constan...
Anonymous
07:46 AM Bug #57868 (New): iSCSI: rbd-target-api reports python version and identified 'unsupported versio...
When running the cephadm-deployed iSCSI container images, the API endpoint exposes Python versions. This triggers vu... Dan Poltawski
04:35 AM Dashboard Cleanup #57867 (Resolved): mgr/dashboard: migrate bootstrap 4 to 5
h3. Description of problem
_here_
h3. Environment
* @ceph version@ string:
* Platform (OS/distro/release)...
Nizamudeen A
04:34 AM Dashboard Cleanup #57866 (Resolved): mgr/dashboard: update to angular 13
Nizamudeen A
12:14 AM crimson Bug #57549: Crimson: Alienstore not work after ceph enable c++20
Do you mean rados bench works on Ubuntu 20.04 on your machine for alienstore? chunmei liu
 
