Activity

From 12/16/2022 to 01/14/2023

01/14/2023

08:27 AM Bug #58461 (Fix Under Review): osd/scrub: replica-response timeout is handled without locking the PG
Ronen Friedman
08:25 AM Bug #58461 (Fix Under Review): osd/scrub: replica-response timeout is handled without locking the PG
In ReplicaReservations::no_reply_t, a callback calls handle_no_reply_timeout()
without first locking the PG.
Intr...
Ronen Friedman

01/13/2023

09:41 AM Bug #58370: OSD crash
Radoslaw Zarzynski wrote:
> The PG was 2.50:
>
> [...]
>
> The PG was in the @Deleting@ substate:
>
> [...]
>...
yite gu

01/12/2023

08:48 PM Bug #58436 (Fix Under Review): ceph cluster log reporting log level in numeric format for the clo...
Prashant D
08:43 PM Bug #58436 (Fix Under Review): ceph cluster log reporting log level in numeric format for the clo...
The cluster log now reports the log level as an integer value rather than a human-readable level, e.g. DBG, INF, etc.
16735...
Prashant D
01:20 PM Bug #51194: PG recovery_unfound after scrub repair failed on primary
We had another occurrence of this on Pacific v16.2.9 Enrico Bocchi
11:28 AM Backport #58040 (In Progress): quincy: osd: add created_at and ceph_version_when_created metadata
Igor Fedotov

01/10/2023

02:12 PM Bug #58410: Set single compression algorithm as a default value in ms_osd_compression_algorithm i...
BZ link: https://bugzilla.redhat.com/show_bug.cgi?id=2155380 Shreyansh Sancheti
02:10 PM Bug #58410 (Pending Backport): Set single compression algorithm as a default value in ms_osd_comp...
Description of problem:
The default value for the compression parameter "ms_osd_compression_algorithm" is assigned t...
Shreyansh Sancheti
01:38 AM Bug #57977: osd:tick checking mon for new map
Radoslaw Zarzynski wrote:
> Per the comment #11 I'm redirecting Prashant's questions from comment #9 to the reporter...
yite gu
12:50 AM Documentation #58401 (Resolved): cephadm's "Replacing an OSD" instructions work better than RADOS...
Zac Dover

01/09/2023

06:50 PM Bug #58370: OSD crash
The PG was 2.50:... Radoslaw Zarzynski
06:36 PM Bug #57852 (In Progress): osd: unhealthy osd cannot be marked down in time
Radoslaw Zarzynski
06:35 PM Bug #57977: osd:tick checking mon for new map
Per the comment #11 I'm redirecting Prashant's questions from comment #9 to the reporter.
@yite gu: is the deploym...
Radoslaw Zarzynski
02:24 PM Bug #57977: osd:tick checking mon for new map
@Prashant, I was thinking about this further. Although it is a containerized env, hostpid=true so the PIDs should be ... Joshua Baergen
06:29 PM Bug #49689: osd/PeeringState.cc: ceph_abort_msg("past_interval start interval mismatch") start
Label assigned but blocked due to the lab issue. Bump up. Radoslaw Zarzynski
06:27 PM Bug #56101: Gibba Cluster: 17.2.0 to 17.2.1 RC upgrade OSD crash in function safe_timer
Still blocked due to the lab issue. Bump up. Radoslaw Zarzynski
06:12 PM Documentation #58401: cephadm's "Replacing an OSD" instructions work better than RADOS's "Replaci...
https://github.com/ceph/ceph/pull/49677 Zac Dover
05:43 PM Documentation #58401 (Resolved): cephadm's "Replacing an OSD" instructions work better than RADOS...
<Infinoid> For posterity, https://docs.ceph.com/en/quincy/cephadm/services/osd/#replacing-an-osd seems to be working ... Zac Dover
02:40 PM Bug #58379 (In Progress): no active mgr after ~1 hour
Nitzan Mordechai

01/06/2023

11:06 PM Bug #44400: Marking OSD out causes primary-affinity 0 to be ignored when up_set has no common OSD...
Just confirming this is still present in pacific:... Dan van der Ster
05:55 PM Documentation #58374 (Resolved): crushtool flags remain undocumented in the crushtool manpage
Zac Dover
05:55 PM Documentation #58374: crushtool flags remain undocumented in the crushtool manpage
https://github.com/ceph/ceph/pull/49653 Zac Dover
05:37 PM Bug #57977: osd:tick checking mon for new map
@Prashant - thanks! Yes, this is containerized, so that's certainly possible in our case. Joshua Baergen
03:20 AM Bug #57977: osd:tick checking mon for new map
Radoslaw Zarzynski wrote:
> The issue during the upgrade looks awfully similar to a downstream Prashant has working ...
Prashant D
03:27 AM Bug #57852: osd: unhealthy osd cannot be marked down in time
Sure Radek. Let me have a look at this. Prashant D
01:50 AM Bug #58370: OSD crash
Radoslaw Zarzynski wrote:
> Is the related log available by any chance?
yite gu

01/05/2023

04:00 PM Feature #58389 (New): CRUSH algorithm should support 1 copy on SSD/NVME and 2 copies on HDD (and ...
Brad Fitzpatrick makes the following request to Zac Dover in private correspondence on 05 Jan 2023:
"I'm kinda dis...
Zac Dover

01/04/2023

08:02 PM Bug #56101: Gibba Cluster: 17.2.0 to 17.2.1 RC upgrade OSD crash in function safe_timer
This has been added to a testing batch. The holdup is that main builds are failing from a dependency. See https://tra... Laura Flores
06:52 PM Bug #56101: Gibba Cluster: 17.2.0 to 17.2.1 RC upgrade OSD crash in function safe_timer
bumping this up (fix waits for testing) Radoslaw Zarzynski
07:34 PM Bug #58355 (Need More Info): OSD: segmentation fault in tc_newarray
A coredump would be really useful here.
BTW: it's better to paste the backtraces as plain text to let people searc...
Radoslaw Zarzynski
07:31 PM Bug #58356 (Need More Info): osd:segmentation fault in tcmalloc's ReleaseToCentralCache
Thanks for the report. Do you still have the coredump, maybe? Radoslaw Zarzynski
07:27 PM Bug #21592: LibRadosCWriteOps.CmpExt got 0 instead of -4095-1
... Radoslaw Zarzynski
07:18 PM Bug #58370 (Need More Info): OSD crash
Is the related log available by any chance? Radoslaw Zarzynski
07:17 PM Bug #58052: Empty Pool (zero objects) shows usage.
Downloading manually. Neha is testing ceph-post-file. Radoslaw Zarzynski
06:57 PM Bug #57699: slow osd boot with valgrind (reached maximum tries (50) after waiting for 300 seconds)
bumping up (fix awaits qa) Radoslaw Zarzynski
06:54 PM Bug #49689: osd/PeeringState.cc: ceph_abort_msg("past_interval start interval mismatch") start
bumping up (fix awaits qa) Radoslaw Zarzynski
06:52 PM Backport #58381 (Resolved): quincy: mon:stretch-cluster: mishandled removed_ranks -> inconsistent...
Backport Bot
06:51 PM Backport #58380 (Resolved): pacific: mon:stretch-cluster: mishandled removed_ranks -> inconsisten...
Backport Bot
06:51 PM Bug #58049 (Pending Backport): mon:stretch-cluster: mishandled removed_ranks -> inconsistent peer...
Radoslaw Zarzynski
06:29 PM Bug #58379: no active mgr after ~1 hour
When the message :... Nitzan Mordechai
06:21 PM Bug #58379 (Resolved): no active mgr after ~1 hour
After checking the BZ: https://bugzilla.redhat.com/show_bug.cgi?id=2106031
I was able to recreate the issue on main ...
Nitzan Mordechai

01/03/2023

09:12 AM Bug #50462: OSDs crash in osd/osd_types.cc: FAILED ceph_assert(clone_overlap.count(clone))
Justin Mammarella wrote:
> We are seeing this bug in Nautilus 14.2.15 to 14.2.22 replicated pool.
>
> Two of our...
hoan nv
08:43 AM Bug #57699: slow osd boot with valgrind (reached maximum tries (50) after waiting for 300 seconds)
Sergii Kuzko wrote:
> Hi
> Can you update the bug status
> Or transfer to the group of the current version 17.2.6
Sergii Kuzko
08:41 AM Bug #57699: slow osd boot with valgrind (reached maximum tries (50) after waiting for 300 seconds)
Hi
Can you update the bug status
Or transfer to the group of the current version 17.2.5
Sergii Kuzko
06:01 AM Documentation #58374 (Resolved): crushtool flags remain undocumented in the crushtool manpage
>2023-01-01: brad@danga.com: https://docs.ceph.com/en/quincy/man/8/crushtool/ seems out of date. I'm running the quin... Zac Dover

12/31/2022

08:56 AM Bug #58052: Empty Pool (zero objects) shows usage.
Any thoughts on this? Brian Woods

12/29/2022

08:55 AM Bug #58370 (Need More Info): OSD crash
... yite gu

12/28/2022

11:26 AM Bug #21592: LibRadosCWriteOps.CmpExt got 0 instead of -4095-1
/a/ksirivad-2022-12-22_17:58:01-rados-wip-ksirivad-testing-pacific-distro-default-smithi/7125137/ Nitzan Mordechai
11:26 AM Bug #58130: LibRadosAio.SimpleWrite hang and pkill
Kamoltat (Junior) Sirivadhna wrote:
> /a/ksirivad-2022-12-22_17:58:01-rados-wip-ksirivad-testing-pacific-distro-defa...
Nitzan Mordechai

12/26/2022

07:46 AM Bug #58305: src/mon/AuthMonitor.cc: FAILED ceph_assert(version > keys_ver)
Radoslaw Zarzynski wrote:
> Thanks for the log file! Would you be able to try to replicate with higher debug levels...
yite gu
04:54 AM Bug #58356 (Need More Info): osd:segmentation fault in tcmalloc's ReleaseToCentralCache
OSD crash. Program terminated with signal SIGSEGV, Segmentation fault. Detailed stack information is in the attachme... 王子敬 wang
04:52 AM Bug #58130: LibRadosAio.SimpleWrite hang and pkill
/a/ksirivad-2022-12-22_17:58:01-rados-wip-ksirivad-testing-pacific-distro-default-smithi/7125137/ Kamoltat (Junior) Sirivadhna
03:50 AM Bug #58355 (Need More Info): OSD: segmentation fault in tc_newarray
The OSD dumps core when cosbench uploads data. Detailed stack information is in the attachment, in Bluestore::_write. 王子敬 wang

12/24/2022

10:10 AM Documentation #58354 (Resolved): doc/ceph-volume/lvm/encryption.rst is inaccurate -- LUKS version...
Stefan Kooman's email of 20 Dec 2022 to dev@ceph.io, bearing the subject line "ceph-volume questions / enhancements",... Zac Dover

12/21/2022

10:16 PM Bug #57546 (Fix Under Review): rados/thrash-erasure-code: wait_for_recovery timeout due to "activ...
Radoslaw Zarzynski
06:57 PM Bug #57546: rados/thrash-erasure-code: wait_for_recovery timeout due to "active+clean+remapped+la...
I guess in main we should revert the opposite commit as both are there. Radoslaw Zarzynski
09:27 PM Backport #58117: quincy: qa/workunits/rados/test_librados_build.sh: specify redirect in curl command
I know the backport is in progress but dumping this here just for reference.
/a/ksirivad-2022-12-21_15:23:02-rados...
Kamoltat (Junior) Sirivadhna
08:09 PM Backport #58337 (In Progress): pacific: mon-stretched_cluster: degraded stretched mode lead to Mo...
Neha Ojha
08:06 PM Backport #58337 (Rejected): pacific: mon-stretched_cluster: degraded stretched mode lead to Monit...
Original backport https://github.com/ceph/ceph/pull/48803 was reverted in https://github.com/ceph/ceph/pull/49412 due... Neha Ojha
08:07 PM Backport #58338 (In Progress): quincy: mon-stretched_cluster: degraded stretched mode lead to Mon...
Neha Ojha
08:06 PM Backport #58338 (Resolved): quincy: mon-stretched_cluster: degraded stretched mode lead to Monito...
https://github.com/ceph/ceph/pull/48802 Neha Ojha
07:50 PM Bug #58052: Empty Pool (zero objects) shows usage.
Radoslaw Zarzynski wrote:
> Glad you've found it! Would you mind uploading via the @ceph-post-file@ (https://docs.ceph.c...
Brian Woods
07:10 PM Bug #58052: Empty Pool (zero objects) shows usage.
Glad you've found it! Would you mind uploading via the @ceph-post-file@ (https://docs.ceph.com/en/quincy/man/8/ceph-post-... Radoslaw Zarzynski
07:33 PM Bug #58155 (Fix Under Review): mon:ceph_assert(m < ranks.size()) `different code path than tracke...
Radoslaw Zarzynski
07:32 PM Bug #58305: src/mon/AuthMonitor.cc: FAILED ceph_assert(version > keys_ver)
Thanks for the log file! Would you be able to try to replicate with higher debug levels?
Perhaps something like: ...
Radoslaw Zarzynski
07:26 PM Bug #58106: when a large number of error ops appear in the OSDs,pglog does not trim.
Well, values around 600-900 kitems aren't looking very large to me. Definitely they are much, much smaller than anyth... Radoslaw Zarzynski
07:23 PM Backport #58336 (Resolved): pacific: qa/standalone/mon: --mon-initial-members setting causes us t...
Backport Bot
07:23 PM Backport #58335 (Resolved): quincy: qa/standalone/mon: --mon-initial-members setting causes us to...
Backport Bot
07:20 PM Bug #57937 (Rejected): pg autoscaler of rgw pools doesn't work after creating otp pool
Not a Ceph issue per the last comment. Radoslaw Zarzynski
07:19 PM Bug #58130: LibRadosAio.SimpleWrite hang and pkill
Based on comment #14, we have a "fix candidate" that might also help with this issue.
If that's correct, we may wait for merging ...
Radoslaw Zarzynski
07:16 PM Bug #58132 (Pending Backport): qa/standalone/mon: --mon-initial-members setting causes us to popu...
Radoslaw Zarzynski
07:16 PM Backport #58334 (Resolved): quincy: mon/monclient: update "unable to obtain rotating service keys...
https://github.com/ceph/ceph/pull/50405 Backport Bot
07:16 PM Backport #58333 (Rejected): pacific: mon/monclient: update "unable to obtain rotating service key...
https://github.com/ceph/ceph/pull/54556 Backport Bot
07:14 PM Bug #17170 (Pending Backport): mon/monclient: update "unable to obtain rotating service keys when...
Radoslaw Zarzynski
07:13 PM Bug #48896: osd/OSDMap.cc: FAILED ceph_assert(osd_weight.count(i.first))
Low due to the low occurrence frequency. Radoslaw Zarzynski
06:58 PM Bug #58240 (Fix Under Review): osd/scrub: modifying osd_deep_scrub_stride while pg is doing deep ...
Radoslaw Zarzynski
06:51 PM Bug #58239 (Resolved): pacific: src/mon/Monitor.cc: FAILED ceph_assert(osdmon()->is_writeable())
Radoslaw Zarzynski
06:50 PM Bug #57017: mon-stretched_cluster: degraded stretched mode lead to Monitor crash
The pacific backport got reverted in https://github.com/ceph/ceph/pull/49412. Radoslaw Zarzynski
06:44 PM Bug #51729 (In Progress): Upmap verification fails for multi-level crush rule
Radoslaw Zarzynski
03:25 PM Bug #57105: quincy: ceph osd pool set <pool> size math error
This was fixed in main https://github.com/ceph/ceph/pull/44430 but was not backported to Q.
Instead of backporting t...
Matan Breizman
10:50 AM Bug #57105 (Fix Under Review): quincy: ceph osd pool set <pool> size math error
This PR was proposed after a BZ reported the same issue.
Matan Breizman
11:22 AM Bug #58288: quincy: mon: pg_num_check() according to crush rule
After the revert is merged (https://github.com/ceph/ceph/pull/49465),
pg_num_check() will return to not taking the c...
Matan Breizman
10:53 AM Bug #54188: Setting too many PGs leads error handling overflow
Setting this tracker as a duplicate. Seems like the same issue, and the PR proposed for 57105 should address this one as well. Matan Breizman

12/20/2022

09:57 PM Bug #47025: rados/test.sh: api_watch_notify_pp LibRadosWatchNotifyECPP.WatchNotify failed
https://github.com/ceph/ceph/pull/49109/commits/31750d5e8ae5f64edf934e2350dfa3c98df68b5a Brad Hubbard
09:56 PM Bug #47025 (Fix Under Review): rados/test.sh: api_watch_notify_pp LibRadosWatchNotifyECPP.WatchNo...
Brad Hubbard
12:06 PM Backport #58315 (In Progress): quincy: Valgrind reports memory "Leak_DefinitelyLost" errors.
Nitzan Mordechai
12:04 PM Backport #58314 (In Progress): pacific: Valgrind reports memory "Leak_DefinitelyLost" errors.
Nitzan Mordechai
11:56 AM Bug #58305: src/mon/AuthMonitor.cc: FAILED ceph_assert(version > keys_ver)
Radoslaw Zarzynski wrote:
> Thanks for the report! Do you have a corresponding log or coredump by any chance?
This lo...
yite gu
09:23 AM Bug #58316: Ceph health metric Scraping still broken
BTW this is the output of @smartctl -a --json@ on the device:... Janek Bevendorff
09:17 AM Bug #58316 (New): Ceph health metric Scraping still broken
This was brought up in #46285 already, but the issue has been marked as rejected.
When I run @ceph device scrape-h...
Janek Bevendorff

12/19/2022

07:08 PM Backport #58315 (Resolved): quincy: Valgrind reports memory "Leak_DefinitelyLost" errors.
https://github.com/ceph/ceph/pull/49522 Backport Bot
07:08 PM Backport #58314 (Resolved): pacific: Valgrind reports memory "Leak_DefinitelyLost" errors.
https://github.com/ceph/ceph/pull/49521 Backport Bot
07:00 PM Bug #58218 (Duplicate): osd
Radoslaw Zarzynski
06:59 PM Bug #58178 (Need More Info): FAILED ceph_assert(last_e.version.version < e.version.version)
Radoslaw Zarzynski
06:59 PM Bug #52136 (Pending Backport): Valgrind reports memory "Leak_DefinitelyLost" errors.
Radoslaw Zarzynski
06:58 PM Bug #57751 (Resolved): LibRadosAio.SimpleWritePP hang and pkill
Radoslaw Zarzynski
06:56 PM Bug #58288 (In Progress): quincy: mon: pg_num_check() according to crush rule
Just updating the tracker's state to fit the reality. Radoslaw Zarzynski
06:47 PM Bug #51652: heartbeat timeouts on filestore OSDs while deleting objects in upgrade:pacific-p2p-pa...
Lowered the priority as @FileStore@ is not only deprecated but also being removed right now. Radoslaw Zarzynski
06:40 PM Bug #58305 (Need More Info): src/mon/AuthMonitor.cc: FAILED ceph_assert(version > keys_ver)
Thanks for the report! Do you have a corresponding log or coredump by any chance? Radoslaw Zarzynski
05:06 PM Documentation #46126: RGW docs lack an explanation of how permissions management works, especiall...
Sure, very much appreciated.
Matt
Matt Benjamin
05:03 PM Documentation #46126: RGW docs lack an explanation of how permissions management works, especiall...
Matt,
I don't mean to endorse dirtwash's rudeness. I mean to capture an impassioned--if inelegant and abusive--req...
Zac Dover
10:01 AM Bug #58281 (Rejected): osd:memory usage exceeds the osd_memory_target
Igor Fedotov
03:56 AM Bug #58281: osd:memory usage exceeds the osd_memory_target
Igor Fedotov wrote:
> Please note that osd_memory_target is not a hard limit. It's just 'target' OSD usage that OSD ...
yite gu

12/17/2022

07:47 PM Bug #58305 (Need More Info): src/mon/AuthMonitor.cc: FAILED ceph_assert(version > keys_ver)
... yite gu

12/16/2022

04:44 PM Bug #58304 (Fix Under Review): pybind: ioctx.get_omap_keys asserts if start_after parameter is no...
Igor Fedotov
04:30 PM Bug #58304 (In Progress): pybind: ioctx.get_omap_keys asserts if start_after parameter is non-empty
Igor Fedotov
04:29 PM Bug #58304 (Pending Backport): pybind: ioctx.get_omap_keys asserts if start_after parameter is no...
Igor Fedotov
 
