Activity

From 11/22/2022 to 12/21/2022

12/21/2022

10:16 PM Bug #57546 (Fix Under Review): rados/thrash-erasure-code: wait_for_recovery timeout due to "activ...
Radoslaw Zarzynski
06:57 PM Bug #57546: rados/thrash-erasure-code: wait_for_recovery timeout due to "active+clean+remapped+la...
I guess in main we should revert the opposite commit as both are there. Radoslaw Zarzynski
09:27 PM Backport #58117: quincy: qa/workunits/rados/test_librados_build.sh: specify redirect in curl command
I know the backport is in progress but dumping this here just for reference.
/a/ksirivad-2022-12-21_15:23:02-rados...
Kamoltat (Junior) Sirivadhna
08:09 PM Backport #58337 (In Progress): pacific: mon-stretched_cluster: degraded stretched mode lead to Mo...
Neha Ojha
08:06 PM Backport #58337 (In Progress): pacific: mon-stretched_cluster: degraded stretched mode lead to Mo...
Original backport https://github.com/ceph/ceph/pull/48803 was reverted in https://github.com/ceph/ceph/pull/49412 due... Neha Ojha
08:07 PM Backport #58338 (In Progress): quincy: mon-stretched_cluster: degraded stretched mode lead to Mon...
Neha Ojha
08:06 PM Backport #58338 (Resolved): quincy: mon-stretched_cluster: degraded stretched mode lead to Monito...
https://github.com/ceph/ceph/pull/48802 Neha Ojha
07:50 PM Bug #58052: Empty Pool (zero objects) shows usage.
Radoslaw Zarzynski wrote:
> Glad you've found it! Would you mind uploading via the @ceph-post-file@ (https://docs.ceph.c...
Brian Woods
07:10 PM Bug #58052: Empty Pool (zero objects) shows usage.
Glad you've found it! Would you mind uploading via the @ceph-post-file@ (https://docs.ceph.com/en/quincy/man/8/ceph-post-... Radoslaw Zarzynski
07:33 PM Bug #58155 (Fix Under Review): mon:ceph_assert(m < ranks.size()) `different code path than tracke...
Radoslaw Zarzynski
07:32 PM Bug #58305: src/mon/AuthMonitor.cc: FAILED ceph_assert(version > keys_ver)
Thanks for the log file! Would you be able to try to replicate with higher debug levels?
Perhaps something like: ...
Radoslaw Zarzynski
07:26 PM Bug #58106: when a large number of error ops appear in the OSDs,pglog does not trim.
Well, values around 600-900 kitems aren't looking very large to me. Definitely they are much, much smaller than anyth... Radoslaw Zarzynski
07:23 PM Backport #58336 (Resolved): pacific: qa/standalone/mon: --mon-initial-members setting causes us t...
Backport Bot
07:23 PM Backport #58335 (Resolved): quincy: qa/standalone/mon: --mon-initial-members setting causes us to...
Backport Bot
07:20 PM Bug #57937 (Rejected): pg autoscaler of rgw pools doesn't work after creating otp pool
Not a Ceph issue per the last comment. Radoslaw Zarzynski
07:19 PM Bug #58130: LibRadosAio.SimpleWrite hang and pkill
Based on comment #14, we have a "fix candidate" that might also help with this issue.
If that's correct, we may wait for merging ...
Radoslaw Zarzynski
07:16 PM Bug #58132 (Pending Backport): qa/standalone/mon: --mon-initial-members setting causes us to popu...
Radoslaw Zarzynski
07:16 PM Backport #58334 (Resolved): quincy: mon/monclient: update "unable to obtain rotating service keys...
https://github.com/ceph/ceph/pull/50405 Backport Bot
07:16 PM Backport #58333 (Rejected): pacific: mon/monclient: update "unable to obtain rotating service key...
https://github.com/ceph/ceph/pull/54556 Backport Bot
07:14 PM Bug #17170 (Pending Backport): mon/monclient: update "unable to obtain rotating service keys when...
Radoslaw Zarzynski
07:13 PM Bug #48896: osd/OSDMap.cc: FAILED ceph_assert(osd_weight.count(i.first))
Low due to the low occurrence frequency. Radoslaw Zarzynski
06:58 PM Bug #58240 (Fix Under Review): osd/scrub: modifying osd_deep_scrub_stride while pg is doing deep ...
Radoslaw Zarzynski
06:51 PM Bug #58239 (Resolved): pacific: src/mon/Monitor.cc: FAILED ceph_assert(osdmon()->is_writeable())
Radoslaw Zarzynski
06:50 PM Bug #57017: mon-stretched_cluster: degraded stretched mode lead to Monitor crash
The pacific backport got reverted in https://github.com/ceph/ceph/pull/49412. Radoslaw Zarzynski
06:44 PM Bug #51729 (In Progress): Upmap verification fails for multi-level crush rule
Radoslaw Zarzynski
03:25 PM Bug #57105: quincy: ceph osd pool set <pool> size math error
This was fixed in main https://github.com/ceph/ceph/pull/44430 but was not backported to Q.
Instead of backporting t...
Matan Breizman
10:50 AM Bug #57105 (Fix Under Review): quincy: ceph osd pool set <pool> size math error
This PR is proposed after a BZ was reporting the same issue.
Matan Breizman
11:22 AM Bug #58288: quincy: mon: pg_num_check() according to crush rule
After the revert is merged (https://github.com/ceph/ceph/pull/49465),
pg_num_check() will return to not taking the c...
Matan Breizman
10:53 AM Bug #54188: Setting too many PGs leads error handling overflow
Setting this tracker as a duplicate. Seems like the same issue, and the PR proposed for 57105 should address this one as well. Matan Breizman

12/20/2022

09:57 PM Bug #47025: rados/test.sh: api_watch_notify_pp LibRadosWatchNotifyECPP.WatchNotify failed
https://github.com/ceph/ceph/pull/49109/commits/31750d5e8ae5f64edf934e2350dfa3c98df68b5a Brad Hubbard
09:56 PM Bug #47025 (Fix Under Review): rados/test.sh: api_watch_notify_pp LibRadosWatchNotifyECPP.WatchNo...
Brad Hubbard
12:06 PM Backport #58315 (In Progress): quincy: Valgrind reports memory "Leak_DefinitelyLost" errors.
Nitzan Mordechai
12:04 PM Backport #58314 (In Progress): pacific: Valgrind reports memory "Leak_DefinitelyLost" errors.
Nitzan Mordechai
11:56 AM Bug #58305: src/mon/AuthMonitor.cc: FAILED ceph_assert(version > keys_ver)
Radoslaw Zarzynski wrote:
> Thanks for the report! Do you have a corresponding log or coredump by any chance?
This lo...
yite gu
09:23 AM Bug #58316: Ceph health metric Scraping still broken
BTW this is the output of @smartctl -a --json@ on the device:... Janek Bevendorff
09:17 AM Bug #58316 (New): Ceph health metric Scraping still broken
This was brought up in #46285 already, but the issue has been marked as rejected.
When I run @ceph device scrape-h...
Janek Bevendorff

12/19/2022

07:08 PM Backport #58315 (Resolved): quincy: Valgrind reports memory "Leak_DefinitelyLost" errors.
https://github.com/ceph/ceph/pull/49522 Backport Bot
07:08 PM Backport #58314 (Resolved): pacific: Valgrind reports memory "Leak_DefinitelyLost" errors.
https://github.com/ceph/ceph/pull/49521 Backport Bot
07:00 PM Bug #58218 (Duplicate): osd
Radoslaw Zarzynski
06:59 PM Bug #58178 (Need More Info): FAILED ceph_assert(last_e.version.version < e.version.version)
Radoslaw Zarzynski
06:59 PM Bug #52136 (Pending Backport): Valgrind reports memory "Leak_DefinitelyLost" errors.
Radoslaw Zarzynski
06:58 PM Bug #57751 (Resolved): LibRadosAio.SimpleWritePP hang and pkill
Radoslaw Zarzynski
06:56 PM Bug #58288 (In Progress): quincy: mon: pg_num_check() according to crush rule
Just updating the tracker's state to fit the reality. Radoslaw Zarzynski
06:47 PM Bug #51652: heartbeat timeouts on filestore OSDs while deleting objects in upgrade:pacific-p2p-pa...
Lowered the priority as @FileStore@ is not only deprecated but also being removed right now. Radoslaw Zarzynski
06:40 PM Bug #58305 (Need More Info): src/mon/AuthMonitor.cc: FAILED ceph_assert(version > keys_ver)
Thanks for the report! Do you have a corresponding log or coredump by any chance? Radoslaw Zarzynski
05:06 PM Documentation #46126: RGW docs lack an explanation of how permissions management works, especiall...
Sure, very much appreciated.
Matt
Matt Benjamin
05:03 PM Documentation #46126: RGW docs lack an explanation of how permissions management works, especiall...
Matt,
I don't mean to endorse dirtwash's rudeness. I mean to capture an impassioned--if inelegant and abusive--req...
Zac Dover
10:01 AM Bug #58281 (Rejected): osd:memory usage exceeds the osd_memory_target
Igor Fedotov
03:56 AM Bug #58281: osd:memory usage exceeds the osd_memory_target
Igor Fedotov wrote:
> Please note that osd_memory_target is not a hard limit. It's just 'target' OSD usage that OSD ...
yite gu

12/17/2022

07:47 PM Bug #58305 (Need More Info): src/mon/AuthMonitor.cc: FAILED ceph_assert(version > keys_ver)
... yite gu

12/16/2022

04:44 PM Bug #58304 (Fix Under Review): pybind: ioctx.get_omap_keys asserts if start_after parameter is no...
Igor Fedotov
04:30 PM Bug #58304 (In Progress): pybind: ioctx.get_omap_keys asserts if start_after parameter is non-empty
Igor Fedotov
04:29 PM Bug #58304 (Pending Backport): pybind: ioctx.get_omap_keys asserts if start_after parameter is no...
Igor Fedotov

12/15/2022

10:51 PM Bug #51652: heartbeat timeouts on filestore OSDs while deleting objects in upgrade:pacific-p2p-pa...
/a/yuriw-2022-12-14_15:40:37-upgrade:pacific-p2p-pacific_16.2.11_RC-distro-default-smithi/7116495 Laura Flores
10:45 PM Bug #58289 (New): "AssertionError: wait_for_recovery: failed before timeout expired" from down pg...
/a/yuriw-2022-12-13_15:58:24-upgrade:pacific-p2p-pacific_16.2.11_RC-distro-default-smithi/7114849... Laura Flores
09:49 PM Bug #56034: qa/standalone/osd/divergent-priors.sh fails in test TEST_divergent_3()
/a/ksirivad-2022-12-15_06:28:05-rados-wip-ksirivad-testing-main-distro-default-smithi/7118004/ Kamoltat (Junior) Sirivadhna
08:45 PM Bug #58288 (Resolved): quincy: mon: pg_num_check() according to crush rule
Corresponding BZ: https://bugzilla.redhat.com/show_bug.cgi?id=2153654
Introduced here in Q: https://github.com/cep...
Matan Breizman
05:33 PM Bug #53789 (Fix Under Review): CommandFailedError (rados/test_python.sh): "RADOS object not found...
Radoslaw Zarzynski
05:04 PM Bug #53789: CommandFailedError (rados/test_python.sh): "RADOS object not found" causes test_rados...
Hypothesis no 1: the issue is a fallout from 65d05fdd579d21dd57b72b1d9148380bc6074269 (PR https://github.com/ceph/cep... Radoslaw Zarzynski
04:24 PM Bug #53575 (Resolved): Valgrind reports memory "Leak_PossiblyLost" errors concerning lib64
Laura Flores
04:14 PM Bug #53575: Valgrind reports memory "Leak_PossiblyLost" errors concerning lib64
https://github.com/ceph/ceph/pull/48641 merged Yuri Weinstein
04:14 PM Bug #57751: LibRadosAio.SimpleWritePP hang and pkill
https://github.com/ceph/ceph/pull/48641 merged Yuri Weinstein
04:14 PM Bug #52136: Valgrind reports memory "Leak_DefinitelyLost" errors.
https://github.com/ceph/ceph/pull/48641 merged Yuri Weinstein
02:39 PM Fix #57963 (Pending Backport): osd: Misleading information displayed for the running configuratio...
Sridhar Seshasayee
02:37 PM Fix #57963 (Resolved): osd: Misleading information displayed for the running configuration of osd...
Sridhar Seshasayee
01:58 PM Bug #58281: osd:memory usage exceeds the osd_memory_target
Please note that osd_memory_target is not a hard limit. It's just 'target' OSD usage that OSD attempts to align with.... Igor Fedotov
11:21 AM Bug #58281: osd:memory usage exceeds the osd_memory_target
ceph daemon osd.1 perf dump... yite gu
11:16 AM Bug #58281: osd:memory usage exceeds the osd_memory_target
... yite gu
11:06 AM Bug #58281 (Rejected): osd:memory usage exceeds the osd_memory_target
I want to limit osd_memory_target to 3758096384 bytes... yite gu
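For reference, the byte value quoted in the report above works out to 3.5 GiB. A minimal Python sketch of the conversion follows; the helper name `bytes_to_gib` is hypothetical and used only for illustration, it is not part of Ceph.

```python
# Hypothetical helper for illustration only; not a Ceph API.
GIB = 1024 ** 3  # bytes in one gibibyte


def bytes_to_gib(n_bytes):
    """Convert a byte count to gibibytes."""
    return n_bytes / GIB


# The osd_memory_target value quoted in the report.
print(bytes_to_gib(3758096384))
```

As noted elsewhere in this tracker, osd_memory_target is a target rather than a hard limit, so observed usage may exceed this figure.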
11:17 AM Bug #57529 (Resolved): mclock backfill is getting higher priority than WPQ
Sridhar Seshasayee
11:16 AM Backport #58273 (Resolved): quincy: mclock backfill is getting higher priority than WPQ
Sridhar Seshasayee

12/14/2022

07:12 PM Backport #58273 (In Progress): quincy: mclock backfill is getting higher priority than WPQ
Sridhar Seshasayee
07:08 PM Backport #58273 (Resolved): quincy: mclock backfill is getting higher priority than WPQ
https://github.com/ceph/ceph/pull/49437 Backport Bot
06:59 PM Bug #57529 (Pending Backport): mclock backfill is getting higher priority than WPQ
Neha Ojha
04:52 PM Bug #58178: FAILED ceph_assert(last_e.version.version < e.version.version)
I can not, sorry. I reported the issue as soon as I saw it, waited a day after it showed up, then reformatted the dri... Kevin Fox
03:20 PM Bug #58178: FAILED ceph_assert(last_e.version.version < e.version.version)
@Kevin Fox, can you please share the failing osd logs with osd debug 20?
We are supposed to print the previous log e...
Nitzan Mordechai
02:21 AM Bug #58218: osd
https://github.com/ceph/ceph/pull/40441 yite gu

12/13/2022

09:14 PM Bug #56785: crash: void OSDShard::register_and_wake_split_child(PG*): assert(!slot->waiting_for_s...
Only 4 occurrences of this crash in the wild, but let's keep an eye on this since now we have a test that reproduced it. Laura Flores
09:13 PM Bug #56785: crash: void OSDShard::register_and_wake_split_child(PG*): assert(!slot->waiting_for_s...
/a/yuriw-2022-12-10_00:03:28-rados-wip-yuri7-testing-2022-12-09-1107-quincy-distro-default-smithi/7111159 Laura Flores
03:59 PM Bug #51729: Upmap verification fails for multi-level crush rule
Update on this Tracker: I am discussing this scenario with Josh Salomon, someone who is very knowledgeable about bala... Laura Flores
05:10 AM Backport #58260 (In Progress): pacific: rados: fix extra tabs on warning for pool copy
Shreyansh Sancheti
04:45 AM Backport #58260 (Rejected): pacific: rados: fix extra tabs on warning for pool copy
https://github.com/ceph/ceph/pull/49400 Backport Bot
05:09 AM Backport #58259 (In Progress): quincy: rados: fix extra tabs on warning for pool copy
Shreyansh Sancheti
04:45 AM Backport #58259 (Rejected): quincy: rados: fix extra tabs on warning for pool copy
https://github.com/ceph/ceph/pull/49399 Backport Bot
04:40 AM Bug #58165 (Pending Backport): rados: fix extra tabs on warning for pool copy
Shreyansh Sancheti

12/12/2022

11:35 PM Bug #58239: pacific: src/mon/Monitor.cc: FAILED ceph_assert(osdmon()->is_writeable())
Analyzing the coredump:
Looking at the backtrace (same as above, but here, the frames are numbered)......
Laura Flores
04:08 PM Bug #58239: pacific: src/mon/Monitor.cc: FAILED ceph_assert(osdmon()->is_writeable())
The failure reproduced 7 times out of 50. Laura Flores
07:07 PM Bug #52129: LibRadosWatchNotify.AioWatchDelete failed
/a/yuriw-2022-12-07_15:47:33-rados-wip-yuri-testing-2022-12-06-1204-distro-default-smithi/7106771 Laura Flores
07:01 PM Bug #58165: rados: fix extra tabs on warning for pool copy
https://github.com/ceph/ceph/pull/49251 merged Yuri Weinstein
05:33 PM Bug #57989: test-erasure-eio.sh fails since pg is not in unfound
Looks related,
/a/yuriw-2022-12-09_22:27:10-rados-main-distro-default-smithi/7110655/...
Matan Breizman
05:27 PM Cleanup #58149 (Resolved): Clarify pool creation failure message due to exceeding max_pgs_per_osd
Laura Flores
05:26 PM Bug #58173 (Resolved): api_aio_pp: failure on LibRadosAio.SimplePoolEIOFlag and LibRadosAio.PoolE...
Laura Flores
05:01 PM Bug #57546: rados/thrash-erasure-code: wait_for_recovery timeout due to "active+clean+remapped+la...
@Radek it's also still in main Laura Flores
05:00 PM Bug #57546: rados/thrash-erasure-code: wait_for_recovery timeout due to "active+clean+remapped+la...
/a/yuriw-2022-12-07_15:48:38-rados-wip-yuri3-testing-2022-12-06-1211-distro-default-smithi/7106890 Laura Flores

12/11/2022

03:08 PM Bug #58240 (Fix Under Review): osd/scrub: modifying osd_deep_scrub_stride while pg is doing deep ...
Modify osd_deep_scrub_stride (e.g., 512KiB to 1MiB) while some pgs are doing deep scrub and then check the status of p... Zhansong Gao

12/09/2022

11:04 PM Bug #58239: pacific: src/mon/Monitor.cc: FAILED ceph_assert(osdmon()->is_writeable())
First seen in https://github.com/ceph/ceph/pull/48803
I scheduled some tests to run over the weekend so we can see...
Laura Flores
10:36 PM Bug #58239 (Resolved): pacific: src/mon/Monitor.cc: FAILED ceph_assert(osdmon()->is_writeable())
This is not deterministic, but ``run_osd`` in qa/standalone/ceph-helpers.sh can result in a timeout when trying to... Kamoltat (Junior) Sirivadhna
08:34 PM Bug #58052: Empty Pool (zero objects) shows usage.
Alright, please let me know when you have the file so I can remove it from my drive:
https://drive.google.com/file/d...
Brian Woods
08:26 PM Bug #55851 (Resolved): Assert in Ceph messenger
Konstantin Shalygin
08:25 PM Backport #57258 (Resolved): pacific: Assert in Ceph messenger
Konstantin Shalygin
06:49 PM Backport #57258: pacific: Assert in Ceph messenger
Backport Bot wrote:
> https://github.com/ceph/ceph/pull/48255
merged
Yuri Weinstein
08:03 PM Bug #55355 (Resolved): osd thread deadlock
Konstantin Shalygin
08:00 PM Backport #56722 (Resolved): pacific: osd thread deadlock
Konstantin Shalygin
06:49 PM Backport #56722: pacific: osd thread deadlock
Backport Bot wrote:
> https://github.com/ceph/ceph/pull/48254
merged
Yuri Weinstein
05:16 PM Bug #56028: thrash_cache_writeback_proxy_none: FAILED ceph_assert(version == old_value.version) i...
/a/yuriw-2022-12-08_15:36:34-rados-wip-yuri2-testing-2022-12-07-0821-pacific-distro-default-smithi/7108597... Laura Flores
05:05 PM Bug #56770: crash: void OSDShard::register_and_wake_split_child(PG*): assert(p != pg_slots.end())
We should address this crash, as it's been seen in 7 clusters over 100 times. Laura Flores
05:03 PM Bug #56770: crash: void OSDShard::register_and_wake_split_child(PG*): assert(p != pg_slots.end())
/a/yuriw-2022-12-08_15:36:34-rados-wip-yuri2-testing-2022-12-07-0821-pacific-distro-default-smithi/7108558... Laura Flores
04:48 PM Backport #58116 (Resolved): pacific: qa/workunits/rados/test_librados_build.sh: specify redirect ...
Laura Flores
08:35 AM Bug #56772: crash: uint64_t SnapSet::get_clone_bytes(snapid_t) const: assert(clone_overlap.count(...
Hi, would it be possible to raise the priority of this bug to High (as well as #57940), as this prevents the incomplet... Thomas Le Gentil

12/08/2022

10:43 PM Backport #58039 (Resolved): pacific: osd: add created_at and ceph_version_when_created metadata
Igor Fedotov
08:59 PM Backport #58039: pacific: osd: add created_at and ceph_version_when_created metadata
https://github.com/ceph/ceph/pull/49144 merged Yuri Weinstein
06:57 PM Bug #48896: osd/OSDMap.cc: FAILED ceph_assert(osd_weight.count(i.first))
/a/yuriw-2022-12-02_14:50:43-rados-wip-yuri8-testing-2022-12-01-0905-pacific-distro-default-smithi/7101371 Laura Flores
04:48 PM Bug #17170: mon/monclient: update "unable to obtain rotating service keys when osd init" to sugge...
https://github.com/ceph/ceph/pull/48318 merged Yuri Weinstein
04:31 PM Bug #53789: CommandFailedError (rados/test_python.sh): "RADOS object not found" causes test_rados...
/a/yuriw-2022-12-06_15:43:07-rados-wip-yuri8-testing-2022-12-05-1031-pacific-distro-default-smithi/7105473 Laura Flores
04:07 PM Bug #58098 (Fix Under Review): qa/workunits/rados/test_crash.sh: crashes are never posted
It seems reasonable to me! Laura Flores
01:10 AM Bug #58098: qa/workunits/rados/test_crash.sh: crashes are never posted
Laura Flores wrote:
> Can we make the default behavior a ceph user, and then provide --setgroup and --setuser option...
Tim Serong
09:27 AM Bug #58218: osd
OSD: crash... yite gu
09:26 AM Bug #58218 (Duplicate): osd
yite gu
07:55 AM Backport #58214 (In Progress): quincy: osd: Improve osd bench accuracy by using buffers with rand...
Sridhar Seshasayee
06:15 AM Backport #58214 (Resolved): quincy: osd: Improve osd bench accuracy by using buffers with random ...
https://github.com/ceph/ceph/pull/49323 Backport Bot
06:03 AM Fix #57577 (Pending Backport): osd: Improve osd bench accuracy by using buffers with random patterns
Sridhar Seshasayee
12:56 AM Bug #58052: Empty Pool (zero objects) shows usage.
Alright, I found the logs can be accessed from docker itself. In the process of pulling them, but I am already at 5G... Brian Woods

12/07/2022

10:37 PM Bug #58098: qa/workunits/rados/test_crash.sh: crashes are never posted
Can we make the default behavior a ceph user, and then provide --setgroup and --setuser options in case we need to re... Laura Flores
11:54 AM Bug #58098: qa/workunits/rados/test_crash.sh: crashes are never posted
Still waiting for that build (debuginfo seems to take an unbelievably long time to publish...)
Meanwhile, I did a ...
Tim Serong
05:51 AM Bug #58098: qa/workunits/rados/test_crash.sh: crashes are never posted
I've just rerun "rados/singleton/{all/test-crash mon_election/connectivity msgr-failures/few msgr/async objectstore/b... Tim Serong
05:30 AM Bug #58098: qa/workunits/rados/test_crash.sh: crashes are never posted
(Sorry, I didn't mean to update any of those fields with my previous comment) Tim Serong
02:34 AM Bug #58098: qa/workunits/rados/test_crash.sh: crashes are never posted
Thanks Laura, I'll try to figure out what's going on. So far, looking at the journal log, the keyring must be OK, or... Tim Serong
03:00 PM Bug #57546: rados/thrash-erasure-code: wait_for_recovery timeout due to "active+clean+remapped+la...
Sent a PR for quincy: https://github.com/ceph/ceph/pull/49304. Radoslaw Zarzynski
01:47 PM Bug #52657: MOSDPGLog::encode_payload(uint64_t): Assertion `HAVE_FEATURE(features, SERVER_NAUTILUS)'
I'm having the same BT in my tests:
/a/nmordech-2022-12-06_13:26:40-rados:thrash-erasure-code-wip-nitzan-peering-aut...
Nitzan Mordechai
01:34 PM Bug #56371 (Duplicate): crash: MOSDPGLog::encode_payload(unsigned long)
Radoslaw Zarzynski
12:04 PM Bug #58130: LibRadosAio.SimpleWrite hang and pkill
Laura, I think it is different from that bug (57751); in that case all the OSDs are still up.
We can see that we nev...
Nitzan Mordechai
09:57 AM Backport #58006 (In Progress): quincy: bail from handle_command() if _generate_command_map() fails
Ilya Dryomov
09:57 AM Backport #58007 (In Progress): pacific: bail from handle_command() if _generate_command_map() fails
Ilya Dryomov

12/06/2022

05:30 PM Bug #58098 (New): qa/workunits/rados/test_crash.sh: crashes are never posted
Laura Flores wrote:
> I scheduled some tests here with the reverts committed to see if they pass: http://pulpito.fro...
Laura Flores
03:48 PM Bug #58098: qa/workunits/rados/test_crash.sh: crashes are never posted
I scheduled some tests here with the reverts committed to see if they pass: http://pulpito.front.sepia.ceph.com/lflor... Laura Flores
03:41 PM Bug #58098: qa/workunits/rados/test_crash.sh: crashes are never posted
Yes, there's one available at /a/yuriw-2022-11-23_15:09:06-rados-wip-yuri10-testing-2022-11-22-1711-distro-default-sm... Laura Flores
06:11 AM Bug #58098: qa/workunits/rados/test_crash.sh: crashes are never posted
Is there a way to view the journalctl-b0.gz archive from the failed runs? Because if ceph-crash can't post crashes o... Tim Serong
11:53 AM Backport #58186 (In Progress): quincy: osd: Misleading information displayed for the running conf...
Sridhar Seshasayee
11:45 AM Backport #58186 (Resolved): quincy: osd: Misleading information displayed for the running configu...
https://github.com/ceph/ceph/pull/49281 Backport Bot
11:43 AM Fix #57963 (Pending Backport): osd: Misleading information displayed for the running configuratio...
Sridhar Seshasayee
10:02 AM Bug #58173 (Fix Under Review): api_aio_pp: failure on LibRadosAio.SimplePoolEIOFlag and LibRadosA...
Matan Breizman
05:09 AM Bug #57937: pg autoscaler of rgw pools doesn't work after creating otp pool
This problem was fixed in Rook v1.10.2. I updated my Rook/Ceph cluster to v1.10.5 and confirmed that this problem dis... Satoru Takeuchi
03:07 AM Bug #58182 (Fix Under Review): Suicide when osd bootup timeout
When the osd is started, if a message is lost, the OSD is stuck in the startup phase.
Restart the osd node through t...
Yao Wu
01:37 AM Bug #58106: when a large number of error ops appear in the OSDs,pglog does not trim.
Radoslaw Zarzynski wrote:
> Hello!
>
> what is on disk is actually serialized from the in-memory representati...
王子敬 wang
12:09 AM Bug #51945: qa/workunits/mon/caps.sh: Error: Expected return 13, got 0
/a/yuriw-2022-11-28_16:10:10-rados-wip-yuri6-testing-2022-11-23-1348-distro-default-smithi/7093588 Laura Flores

12/05/2022

11:37 PM Bug #58098: qa/workunits/rados/test_crash.sh: crashes are never posted
https://shaman.ceph.com/builds/ceph/wip-revert-pr-48713/2b583578473c82604cfdab2faef9f161dc2fb0b9/ Laura Flores
11:20 PM Bug #58098: qa/workunits/rados/test_crash.sh: crashes are never posted
The bug reproduced on Yuri's test branch. The difference between the test branch and the main SHA is that the test br... Laura Flores
07:23 PM Bug #58098: qa/workunits/rados/test_crash.sh: crashes are never posted
Laura Flores wrote:
> Scheduled 50x tests to run here: http://pulpito.front.sepia.ceph.com/lflores-2022-12-05_17:05:...
Laura Flores
07:22 PM Bug #58098: qa/workunits/rados/test_crash.sh: crashes are never posted
I have a feeling that the tests I scheduled earlier on the main branch all passed since the SHA it picked up is older... Laura Flores
07:14 PM Bug #58098: qa/workunits/rados/test_crash.sh: crashes are never posted
Wondering if there could have been a regression caused by https://github.com/ceph/ceph/pull/48713. Laura Flores
06:38 PM Bug #58098: qa/workunits/rados/test_crash.sh: crashes are never posted
/a/yuriw-2022-11-28_21:26:12-rados-wip-yuri7-testing-2022-11-18-1548-distro-default-smithi/7095988
/a/lflores-2022-1...
Laura Flores
04:17 PM Bug #58098: qa/workunits/rados/test_crash.sh: crashes are never posted
Scheduled 50x tests to run here: http://pulpito.front.sepia.ceph.com/lflores-2022-12-05_17:05:59-rados-wip-yuri10-tes... Laura Flores
04:10 PM Bug #58098: qa/workunits/rados/test_crash.sh: crashes are never posted
Three recent instances of this bug in the main branch point to a regression. My next steps here will be to schedule m... Laura Flores
10:46 PM Bug #58052: Empty Pool (zero objects) shows usage.
That is every log file from every node. There are no ceph-mgr* logs. :/
Even from inside the docker on the adm n...
Brian Woods
06:33 PM Bug #58052: Empty Pool (zero objects) shows usage.
Hello. Thanks for the response and the files.... Radoslaw Zarzynski
09:11 PM Bug #58173: api_aio_pp: failure on LibRadosAio.SimplePoolEIOFlag and LibRadosAio.PoolEIOFlag
Building a branch here with https://github.com/ceph/ceph/pull/49029 reverted, which can be used to verify whether it ... Laura Flores
09:03 PM Bug #58173: api_aio_pp: failure on LibRadosAio.SimplePoolEIOFlag and LibRadosAio.PoolEIOFlag
Excuse my update Sam, I see you already added it as a duplicate. Laura Flores
08:55 PM Bug #58173: api_aio_pp: failure on LibRadosAio.SimplePoolEIOFlag and LibRadosAio.PoolEIOFlag
Matan added that test within the last two weeks: https://github.com/ceph/ceph/pull/49029 Samuel Just
07:10 PM Bug #58173 (Resolved): api_aio_pp: failure on LibRadosAio.SimplePoolEIOFlag and LibRadosAio.PoolE...
The workunits/rados/test.sh script is run in the orch suite on some tests. In a few of them, these two tests were fai... Adam King
08:06 PM Bug #58178: FAILED ceph_assert(last_e.version.version < e.version.version)
Noticed an OSD doing this on a cluster over the weekend. It's been crashing consistently since. Kevin Fox
08:05 PM Bug #58178 (Need More Info): FAILED ceph_assert(last_e.version.version < e.version.version)
debug -4> 2022-12-05T19:14:03.556+0000 7fe51028a200 5 osd.57 pg_epoch: 261349 pg[1.573( v 261349'617978754 (2613... Kevin Fox
07:07 PM Bug #56733: Since Pacific upgrade, sporadic latencies plateau on random OSD/disks
I've just let Mark and Ronen know about this issue. Radoslaw Zarzynski
07:05 PM Bug #58156: Monitors do not permit OSD to join after upgrading to Quincy
Radoslaw Zarzynski wrote:
> Hi Igor! What was the intermediary version during the upgrade? We merged https://github....
Igor Fedotov
06:40 PM Bug #58156: Monitors do not permit OSD to join after upgrading to Quincy
Hi Igor! What was the intermediary version during the upgrade? We merged https://github.com/ceph/ceph/pull/44090 but ... Radoslaw Zarzynski
07:00 PM Bug #58142 (In Progress): rbd-python snaps-many-objects: deep-scrub : stat mismatch
Moving to @In progress@ based on the core standup on 1 Dec. Radoslaw Zarzynski
06:56 PM Bug #58106: when a large number of error ops appear in the OSDs,pglog does not trim.
Hello!
What is on disk is actually serialized from the in-memory representation. We don't see huge numbers of ...
Radoslaw Zarzynski
06:24 PM Bug #58166 (Need More Info): mon:DAEMON_OLD_VERSION newer versions is considered older than earlier
If your cluster is in the same state, can you please share mon logs with debug_mon=20? The following code snippet in ... Neha Ojha
02:53 PM Bug #58166: mon:DAEMON_OLD_VERSION newer versions is considered older than earlier
This was probably introduced in https://github.com/ceph/ceph/pull/36759 Tobias Urdin
02:52 PM Bug #58166 (Need More Info): mon:DAEMON_OLD_VERSION newer versions is considered older than earlier
We have a cluster with most mon/mgr/osd are running 16.2.10 and some OSDs are running 16.2.9
The healthcheck does ...
Tobias Urdin
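The report above does not state a root cause. As an illustration only, here is one common way a newer release such as 16.2.10 can end up ordered before 16.2.9: comparing version strings lexicographically instead of numerically. This is a minimal Python sketch with a hypothetical helper `parse_version`; it is an assumption for illustration and not a description of the Ceph monitor code.

```python
# Illustration only: hypothetical helper, not taken from the Ceph source.
def parse_version(version):
    """Split a dotted version string into a tuple of integers."""
    return tuple(int(part) for part in version.split("."))


newer, older = "16.2.10", "16.2.9"

# Plain string comparison goes character by character, so "1" < "9"
# makes 16.2.10 look older than 16.2.9.
string_says_newer_is_older = newer < older

# Comparing integer tuples orders the releases as expected.
numeric_says_newer_is_older = parse_version(newer) < parse_version(older)

print(string_says_newer_is_older, numeric_says_newer_is_older)
```

Whether this is what happens in the health check would need to be confirmed against the monitor logs requested in this tracker.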
06:24 PM Backport #58169 (Resolved): quincy: extra debugs for: [mon] high cpu usage by fn_monstore thread
https://github.com/ceph/ceph/pull/50406 Backport Bot
06:16 PM Feature #58168 (Pending Backport): extra debugs for: [mon] high cpu usage by fn_monstore thread
Radoslaw Zarzynski
06:10 PM Bug #53806: unessesarily long laggy PG state
> I think as long as `acting` does not have duplicate entries, the logic is exactly the same as before.
Yeah. I'm ...
Radoslaw Zarzynski
05:51 PM Backport #55768: pacific: rados_api_tests: LibRadosWatchNotify.AioWatchNotify2 fails
Backport Bot wrote:
> https://github.com/ceph/ceph/pull/46499
merged
Yuri Weinstein
05:34 PM Backport #56648: quincy: [Progress] Do not show NEW PG_NUM value for pool if autoscaler is set to...
Backport Bot wrote:
> https://github.com/ceph/ceph/pull/47925
merged
Yuri Weinstein
05:15 PM Fix #57963: osd: Misleading information displayed for the running configuration of osd_mclock_max...
https://github.com/ceph/ceph/pull/48708 merged Yuri Weinstein
05:12 PM Bug #57782: [mon] high cpu usage by fn_monstore thread
Radoslaw Zarzynski wrote:
> NOT A FIX (extra debugs): https://github.com/ceph/ceph/pull/48513
merged
Yuri Weinstein
04:02 PM Bug #58165 (Fix Under Review): rados: fix extra tabs on warning for pool copy
Laura Flores
12:57 PM Bug #58165 (Resolved): rados: fix extra tabs on warning for pool copy
BZ link: https://bugzilla.redhat.com/show_bug.cgi?id=2148242 Shreyansh Sancheti
03:52 PM Bug #57632 (Fix Under Review): test_envlibrados_for_rocksdb: free(): invalid pointer
Laura Flores
07:37 AM Bug #57940: ceph osd crashes with FAILED ceph_assert(clone_overlap.count(clone)) when nobackfill ...
Thomas Le Gentil wrote:
> I could avoid this crash by removing all pg for which ceph could not get the clone_bytes, ...
Thomas Le Gentil

12/04/2022

11:56 AM Bug #58098: qa/workunits/rados/test_crash.sh: crashes are never posted
/a/yuriw-2022-11-28_21:13:47-rados-wip-yuri11-testing-2022-11-18-1506-distro-default-smithi/7095031/ Matan Breizman
11:46 AM Bug #53789: CommandFailedError (rados/test_python.sh): "RADOS object not found" causes test_rados...
/a/yuriw-2022-11-23_21:36:17-rados-wip-yuri11-testing-2022-11-18-1506-distro-default-smithi/7089814/ Matan Breizman
09:41 AM Backport #58144 (In Progress): pacific: mon/MonCommands: Support dump_historic_slow_ops
Matan Breizman
09:37 AM Backport #58143 (In Progress): quincy: mon/MonCommands: Support dump_historic_slow_ops
Matan Breizman

12/02/2022

09:49 PM Bug #58098: qa/workunits/rados/test_crash.sh: crashes are never posted
In a passed job, the crashes are posted:... Laura Flores
09:33 PM Bug #58098 (In Progress): qa/workunits/rados/test_crash.sh: crashes are never posted
In the job that passed, the mgr.server reports a recent crash:
/a/lflores-2022-11-30_22:53:49-rados-main-distro-de...
Laura Flores
09:06 PM Bug #58098: qa/workunits/rados/test_crash.sh: crashes are never posted
In one of the jobs that passed, the OSDs were also failed for 31 seconds, but this time, the crashes were detected. S... Laura Flores
09:02 PM Bug #58098: qa/workunits/rados/test_crash.sh: crashes are never posted
Didn't reproduce in the 20x run above, but it did reproduce a second time here:
/a/yuriw-2022-11-28_21:09:37-rados...
Laura Flores
06:09 PM Bug #58052: Empty Pool (zero objects) shows usage.
Attaching server2 to this message.
Brian Woods
06:09 PM Bug #58052: Empty Pool (zero objects) shows usage.
I am realizing those logs are from a single host (server4).
server3 got removed today.
Attaching server1 to this me...
Brian Woods
05:42 PM Bug #58052: Empty Pool (zero objects) shows usage.
Radoslaw Zarzynski wrote:
> Well, I think the command you mentioned took effect for RGW, not MGR. I'm providing the c...
Brian Woods
03:28 PM Bug #58156 (In Progress): Monitors do not permit OSD to join after upgrading to Quincy
Igor Fedotov
03:28 PM Bug #58156 (Resolved): Monitors do not permit OSD to join after upgrading to Quincy
The Nautilus cluster was eventually upgraded to Quincy, and at the end the OSDs stopped joining the cluster.
The i...
Igor Fedotov
03:24 PM Bug #58155 (Resolved): mon:ceph_assert(m < ranks.size()) `different code path than tracker 50089`
Same problem as https://tracker.ceph.com/issues/50089, but it is a different code path.
We opened a new tracker ...
Kamoltat (Junior) Sirivadhna
01:31 AM Bug #58106: when a large number of error ops appear in the OSDs,pglog does not trim.
Nitzan Mordechai wrote:
> 王子敬 wang wrote:
> > Nitzan Mordechai wrote:
> > > Since you attached part of the pglog, ...
王子敬 wang
01:06 AM Bug #57632: test_envlibrados_for_rocksdb: free(): invalid pointer
Linked a possible solution for skipping ubuntu with this test. I scheduled a teuthology test for it, which I will use... Laura Flores

12/01/2022

09:44 PM Bug #58130: LibRadosAio.SimpleWrite hang and pkill
Thanks for your observations, Brad! I'm going to dedicate this Tracker to `LibRadosAio.SimpleWrite` and mark it as re... Laura Flores
09:20 PM Bug #58130: LibRadosAio.SimpleWrite hang and pkill
The issue appears to be in the api_aio test as it gets started but doesn't complete.... Brad Hubbard
08:04 PM Bug #58130: LibRadosAio.SimpleWrite hang and pkill
Ran into another instance of this here:
/a/yuriw-2022-11-30_23:13:27-rados-wip-yuri2-testing-2022-11-30-0724-pacif...
Laura Flores
09:43 PM Bug #57618: rados/test.sh hang and pkilled (LibRadosWatchNotifyEC.WatchNotify)
/a/yuriw-2022-11-29_22:29:58-rados-wip-yuri10-testing-2022-11-29-1005-pacific-distro-default-smithi/7097464/ Laura Flores
09:23 PM Bug #57751: LibRadosAio.SimpleWritePP hang and pkill
possibly 58130 is related Brad Hubbard
07:30 PM Cleanup #58149 (Resolved): Clarify pool creation failure message due to exceeding max_pgs_per_osd
This was inspired by the Re: [ceph-users] proxmox hyperconverged pg calculations in ceph pacific, pve 7.2 thread.
Anthony D'Atri
07:30 PM Bug #50089 (Resolved): mon/MonMap.h: FAILED ceph_assert(m < ranks.size()) when reducing number of...
Kamoltat (Junior) Sirivadhna
06:59 PM Bug #50089 (New): mon/MonMap.h: FAILED ceph_assert(m < ranks.size()) when reducing number of moni...
Kamoltat (Junior) Sirivadhna
04:12 PM Backport #58144 (Resolved): pacific: mon/MonCommands: Support dump_historic_slow_ops
https://github.com/ceph/ceph/pull/49233 Backport Bot
04:12 PM Backport #58143 (Resolved): quincy: mon/MonCommands: Support dump_historic_slow_ops
https://github.com/ceph/ceph/pull/49232 Backport Bot
04:02 PM Bug #58141 (Pending Backport): mon/MonCommands: Support dump_historic_slow_ops
Matan Breizman
12:42 PM Bug #58141 (Resolved): mon/MonCommands: Support dump_historic_slow_ops
Slow ops are being tracked in the mon, but the `dump_historic_slow_ops` command is not registered:
```
$ ceph daemon ....
Matan Breizman
03:56 PM Bug #58142 (In Progress): rbd-python snaps-many-objects: deep-scrub : stat mismatch
... Matan Breizman
03:45 PM Bug #56733: Since Pacific upgrade, sporadic latencies plateau on random OSD/disks
It seems more like a generic RADOS issue. Adam Kupczyk
12:27 PM Bug #57757 (Fix Under Review): ECUtil: terminate called after throwing an instance of 'ceph::buff...
Nitzan Mordechai
08:18 AM Bug #58106: when a large number of error ops appear in the OSDs,pglog does not trim.
王子敬 wang wrote:
> Nitzan Mordechai wrote:
> > Since you attached part of the pglog, I can't see how many entries yo...
Nitzan Mordechai
01:50 AM Bug #58106: when a large number of error ops appear in the OSDs,pglog does not trim.
Nitzan Mordechai wrote:
> Since you attached part of the pglog, I can't see how many entries you have for log and ho...
王子敬 wang
03:41 AM Bug #53806: unessesarily long laggy PG state
Radoslaw Zarzynski wrote:
> OK, Aishwarya has found in testing that the @break@-related commit (https://github.com/c...
玮文 胡
12:51 AM Backport #58040: quincy: osd: add created_at and ceph_version_when_created metadata
please link this Backport tracker issue with GitHub PR https://github.com/ceph/ceph/pull/49159
ceph-backport.sh versi...
Kaoru Esashika

11/30/2022

11:15 PM Bug #58132 (In Progress): qa/standalone/mon: --mon-initial-members setting causes us to populate ...
Kamoltat (Junior) Sirivadhna
11:08 PM Bug #58132 (Resolved): qa/standalone/mon: --mon-initial-members setting causes us to populate rem...
Problem:
--mon-initial-members does nothing but cause monmap
to populate ``removed_ranks`` because the way we sta...
Kamoltat (Junior) Sirivadhna
10:57 PM Bug #58098: qa/workunits/rados/test_crash.sh: crashes are never posted
Neha suggested we see how reproducible this is, so as not to mask any underlying problems by sleeping longer. I sched... Laura Flores
10:34 PM Bug #58130 (In Progress): LibRadosAio.SimpleWrite hang and pkill
A rados api test experienced a failure after the last global tests had successfully run.
/a/yuriw-2022-11-29_22:29...
Laura Flores
07:31 PM Bug #58052: Empty Pool (zero objects) shows usage.
Well, I think the command you mentioned took effect for RGW, not MGR. I'm providing the commands increasing log verbos... Radoslaw Zarzynski
07:25 PM Bug #57977: osd:tick checking mon for new map
The issue during the upgrade looks awfully similar to a downstream issue Prashant has been working on.
Prashant, would find som...
Radoslaw Zarzynski
07:09 PM Bug #58106 (Need More Info): when a large number of error ops appear in the OSDs,pglog does not t...
Radoslaw Zarzynski
10:43 AM Bug #58106: when a large number of error ops appear in the OSDs,pglog does not trim.
Since you attached part of the pglog, I can't see how many entries you have for log and how many for dups
can you pl...
Nitzan Mordechai
08:38 AM Bug #58106: when a large number of error ops appear in the OSDs,pglog does not trim.
王子敬 wang wrote:
> Nitzan Mordechai wrote:
> > @王子敬 wang, can you please send us the output for one of the pgs from ...
王子敬 wang
08:32 AM Bug #58106: when a large number of error ops appear in the OSDs,pglog does not trim.
Nitzan Mordechai wrote:
> @王子敬 wang, can you please send us the output for one of the pgs from ceph-objectstore-tool...
王子敬 wang
07:30 AM Bug #58106: when a large number of error ops appear in the OSDs,pglog does not trim.
@王子敬 wang, can you please send us the output for one of the pgs from ceph-objectstore-tool?... Nitzan Mordechai
02:16 AM Bug #58106: when a large number of error ops appear in the OSDs,pglog does not trim.
Nitzan Mordechai wrote:
> @王子敬 wang can you please provide the output of 'ceph pg dump' ?
ok, the output in the pg_...
王子敬 wang
07:07 PM Bug #57546: rados/thrash-erasure-code: wait_for_recovery timeout due to "active+clean+remapped+la...
I think the invariant here is that the @acting@ container should not have duplicates. If it is broken, we have a more... Radoslaw Zarzynski
01:55 PM Bug #57546: rados/thrash-erasure-code: wait_for_recovery timeout due to "active+clean+remapped+la...
If there are indeed duplicated entries in the acting set, should there be a 'break' at all in this loop? It seems lik... Joshua Baergen
07:00 PM Bug #53806: unessesarily long laggy PG state
OK, Aishwarya has found in testing that the @break@-related commit (https://github.com/ceph/ceph/pull/44499/commits/9... Radoslaw Zarzynski
02:02 PM Bug #53806: unessesarily long laggy PG state
FWIW, we've seen this happen very frequently during Nautilus->{Octopus,Pacific} upgrades. I had just tracked down the... Joshua Baergen
03:36 PM Bug #58114 (Closed): mon: FAILED ceph_assert(rank == new_rank)
Closed because this issue was found in pre-merge testing of PR: https://github.com/ceph/ceph/pull/48698/ Kamoltat (Junior) Sirivadhna
04:14 AM Backport #58039: pacific: osd: add created_at and ceph_version_when_created metadata
please link this Backport tracker issue with GitHub PR https://github.com/ceph/ceph/pull/49144
ceph-backport.sh versi...
Kaoru Esashika

11/29/2022

11:18 PM Bug #54438: test/objectstore/store_test.cc: FAILED ceph_assert(bl_eq(state->contents[noid].data, ...
/a/yuriw-2022-11-28_16:28:53-rados-wip-yuri-testing-2022-11-18-1500-pacific-distro-default-smithi/7094026 Laura Flores
07:14 PM Backport #58117 (In Progress): quincy: qa/workunits/rados/test_librados_build.sh: specify redirec...
https://github.com/ceph/ceph/pull/49140 Laura Flores
06:58 PM Backport #58117 (In Progress): quincy: qa/workunits/rados/test_librados_build.sh: specify redirec...
Backport Bot
07:11 PM Backport #58116 (In Progress): pacific: qa/workunits/rados/test_librados_build.sh: specify redire...
https://github.com/ceph/ceph/pull/49139 Laura Flores
06:58 PM Backport #58116 (Resolved): pacific: qa/workunits/rados/test_librados_build.sh: specify redirect ...
Backport Bot
06:52 PM Bug #58046 (Pending Backport): qa/workunits/rados/test_librados_build.sh: specify redirect in cur...
Laura Flores
05:37 PM Bug #58046: qa/workunits/rados/test_librados_build.sh: specify redirect in curl command
Seen in Pacific run: /a/yuriw-2022-11-28_21:10:48-rados-wip-yuri10-testing-2022-11-28-1042-pacific-distro-default-smi... Aishwarya Mathuria
05:52 PM Bug #57632: test_envlibrados_for_rocksdb: free(): invalid pointer
We discussed this tracker in the RADOS meeting. Sam pointed out that this set of tests doesn't have any actual users,... Laura Flores
05:24 PM Bug #58114 (Closed): mon: FAILED ceph_assert(rank == new_rank)
/a/yuriw-2022-11-28_21:10:48-rados-wip-yuri10-testing-2022-11-28-1042-pacific-distro-default-smithi/7095280/remote/sm... Aishwarya Mathuria
04:59 PM Bug #44595: cache tiering: Error: oid 48 copy_from 493 returned error code -2
... Aishwarya Mathuria
03:05 PM Bug #58107: mon-stretch: old stretch_marked_down_mons leads to ceph unresponsive
Therefore, there is nothing we can do but wait for the other site to come back up, so pgs can complete peering and th... Kamoltat (Junior) Sirivadhna
03:04 PM Bug #58107 (Closed): mon-stretch: old stretch_marked_down_mons leads to ceph unresponsive
Closed because this is not a corner case; to quote Greg Farnum:
``it’s that electing those two monitors means ...
Kamoltat (Junior) Sirivadhna
04:15 AM Bug #58107 (In Progress): mon-stretch: old stretch_marked_down_mons leads to ceph unresponsive
Kamoltat (Junior) Sirivadhna
04:14 AM Bug #58107 (Closed): mon-stretch: old stretch_marked_down_mons leads to ceph unresponsive
h1. How to reproduce the issue
h2. Set up:
mon.a (zone 1) rank=0
mon.b (zone 1) rank=1
mon.c (zone 2) rank=2
...
Kamoltat (Junior) Sirivadhna
01:07 PM Bug #58106: when a large number of error ops appear in the OSDs,pglog does not trim.
@王子敬 wang, can you please provide the output of 'ceph pg dump'? Nitzan Mordechai
01:42 AM Bug #58106 (Need More Info): when a large number of error ops appear in the OSDs,pglog does not t...
When we use the S3 append and copy interfaces of the object gateway, a large number of error ops appear in the OSDs wh... 王子敬 wang
11:12 AM Bug #57940: ceph osd crashes with FAILED ceph_assert(clone_overlap.count(clone)) when nobackfill ...
I could avoid this crash by removing all pg for which ceph could not get the clone_bytes, except the one I was sure t... Thomas Le Gentil
09:02 AM Backport #57496 (Resolved): quincy: Invalid read of size 8 in handle_recovery_delete()
Nitzan Mordechai
07:05 AM Bug #50042 (Fix Under Review): rados/test.sh: api_watch_notify failures
Nitzan Mordechai

11/28/2022

10:24 PM Bug #58098 (Fix Under Review): qa/workunits/rados/test_crash.sh: crashes are never posted
Laura Flores
05:34 PM Bug #58098 (Resolved): qa/workunits/rados/test_crash.sh: crashes are never posted
/a/yuriw-2022-11-23_15:09:06-rados-wip-yuri10-testing-2022-11-22-1711-distro-default-smithi/7087281... Laura Flores
09:43 PM Bug #56733: Since Pacific upgrade, sporadic latencies plateau on random OSD/disks
Just a follow-up.
In the end, what's helping us the most is increasing osd_scrub_sleep to 0.4.
Gilles Mocellin
02:47 PM Bug #52657: MOSDPGLog::encode_payload(uint64_t): Assertion `HAVE_FEATURE(features, SERVER_NAUTILUS)'
Aishwarya Mathuria wrote:
> We suspect that this assert failure is hit in cases when we try to encode a message befo...
Ben Gao
05:05 AM Support #58091 (New): osd: reduce default value of osd_heartbeat_grace
Client I/O hangs for 20s when a peer OSD ping fails; 20s is too long. In case of network jitter, it generally does not exce... yite gu

11/24/2022

03:54 AM Bug #57977: osd:tick checking mon for new map
The more I dig, the more I'm thinking that this might be some race to do with noup, and probably has nothing to do wi... Joshua Baergen
03:42 AM Bug #57977: osd:tick checking mon for new map
Something that's probably worth mentioning - we had noup set in the cluster for each upgrade, and we wait until all O... Joshua Baergen
03:12 AM Bug #57977: osd:tick checking mon for new map
We saw this happen to roughly a dozen OSDs (1-2 per host for some hosts) during a recent upgrade from Nautilus to Pac... Joshua Baergen

11/22/2022

06:17 PM Bug #57977: osd:tick checking mon for new map
I already restarted the OSD daemon, but could not reproduce the issue. If it happens again, I will collect more logs. yite gu
03:54 PM Bug #58052: Empty Pool (zero objects) shows usage.
Radoslaw Zarzynski wrote:
> Could you please provide a log from an active mgr with @debug_ms=1@ and @debug_mgr=20@?
...
Brian Woods