Activity

From 12/05/2022 to 01/03/2023

01/03/2023

09:12 AM Bug #50462: OSDs crash in osd/osd_types.cc: FAILED ceph_assert(clone_overlap.count(clone))
Justin Mammarella wrote:
> We are seeing this bug in Nautilus 14.2.15 to 14.2.22 replicated pool.
>
> Two of our...
hoan nv
08:43 AM Bug #57699: slow osd boot with valgrind (reached maximum tries (50) after waiting for 300 seconds)
Sergii Kuzko wrote:
> Hi
> Can you update the bug status
> Or transfer to the group of the current version 17.2.6
Sergii Kuzko
08:41 AM Bug #57699: slow osd boot with valgrind (reached maximum tries (50) after waiting for 300 seconds)
Hi
Can you update the bug status
Or transfer to the group of the current version 17.2.5
Sergii Kuzko
06:01 AM Documentation #58374 (Resolved): crushtool flags remain undocumented in the crushtool manpage
>2023-01-01: brad@danga.com: https://docs.ceph.com/en/quincy/man/8/crushtool/ seems out of date. I'm running the quin... Zac Dover

12/31/2022

08:56 AM Bug #58052: Empty Pool (zero objects) shows usage.
Any thoughts on this? Brian Woods

12/29/2022

08:55 AM Bug #58370 (Need More Info): OSD crash
... yite gu

12/28/2022

11:26 AM Bug #21592: LibRadosCWriteOps.CmpExt got 0 instead of -4095-1
/a/ksirivad-2022-12-22_17:58:01-rados-wip-ksirivad-testing-pacific-distro-default-smithi/7125137/ Nitzan Mordechai
11:26 AM Bug #58130: LibRadosAio.SimpleWrite hang and pkill
Kamoltat (Junior) Sirivadhna wrote:
> /a/ksirivad-2022-12-22_17:58:01-rados-wip-ksirivad-testing-pacific-distro-defa...
Nitzan Mordechai

12/26/2022

07:46 AM Bug #58305: src/mon/AuthMonitor.cc: FAILED ceph_assert(version > keys_ver)
Radoslaw Zarzynski wrote:
> Thanks for the log file! Would you be able to try to replicate with higher debug levels...
yite gu
04:54 AM Bug #58356 (Need More Info): osd:segmentation fault in tcmalloc's ReleaseToCentralCache
osd crash. Program terminated with signal SIGSEGV, Segmentation fault Detailed stack information is in the attachme... 王子敬 wang
04:52 AM Bug #58130: LibRadosAio.SimpleWrite hang and pkill
/a/ksirivad-2022-12-22_17:58:01-rados-wip-ksirivad-testing-pacific-distro-default-smithi/7125137/ Kamoltat (Junior) Sirivadhna
03:50 AM Bug #58355 (Need More Info): OSD: segmentation fault in tc_newarray
The osd core appears when cosbench uploads data. Detailed stack information is in the attachment, in Bluestore::_write. 王子敬 wang

12/24/2022

10:10 AM Documentation #58354 (Resolved): doc/ceph-volume/lvm/encryption.rst is inaccurate -- LUKS version...
Stefan Kooman's email of 20 Dec 2022 to dev@ceph.io, bearing the subject line "ceph-volume questions / enhancements",... Zac Dover

12/21/2022

10:16 PM Bug #57546 (Fix Under Review): rados/thrash-erasure-code: wait_for_recovery timeout due to "activ...
Radoslaw Zarzynski
06:57 PM Bug #57546: rados/thrash-erasure-code: wait_for_recovery timeout due to "active+clean+remapped+la...
I guess in main we should revert the opposite commit as both are there. Radoslaw Zarzynski
09:27 PM Backport #58117: quincy: qa/workunits/rados/test_librados_build.sh: specify redirect in curl command
I know the backport is in progress but dumping this here just for reference.
/a/ksirivad-2022-12-21_15:23:02-rados...
Kamoltat (Junior) Sirivadhna
08:09 PM Backport #58337 (In Progress): pacific: mon-stretched_cluster: degraded stretched mode lead to Mo...
Neha Ojha
08:06 PM Backport #58337 (Rejected): pacific: mon-stretched_cluster: degraded stretched mode lead to Monit...
Original backport https://github.com/ceph/ceph/pull/48803 was reverted in https://github.com/ceph/ceph/pull/49412 due... Neha Ojha
08:07 PM Backport #58338 (In Progress): quincy: mon-stretched_cluster: degraded stretched mode lead to Mon...
Neha Ojha
08:06 PM Backport #58338 (Resolved): quincy: mon-stretched_cluster: degraded stretched mode lead to Monito...
https://github.com/ceph/ceph/pull/48802 Neha Ojha
07:50 PM Bug #58052: Empty Pool (zero objects) shows usage.
Radoslaw Zarzynski wrote:
> Glad you've found it! Would you mind uploading via the @ceph-post-file@ (https://docs.ceph.c...
Brian Woods
07:10 PM Bug #58052: Empty Pool (zero objects) shows usage.
Glad you've found it! Would you mind uploading via the @ceph-post-file@ (https://docs.ceph.com/en/quincy/man/8/ceph-post-... Radoslaw Zarzynski
07:33 PM Bug #58155 (Fix Under Review): mon:ceph_assert(m < ranks.size()) `different code path than tracke...
Radoslaw Zarzynski
07:32 PM Bug #58305: src/mon/AuthMonitor.cc: FAILED ceph_assert(version > keys_ver)
Thanks for the log file! Would you be able to try to replicate with higher debug levels?
Perhaps something like: ...
Radoslaw Zarzynski
07:26 PM Bug #58106: when a large number of error ops appear in the OSDs,pglog does not trim.
Well, values around 600-900 kitems aren't looking very large to me. Definitely they are much, much smaller than anyth... Radoslaw Zarzynski
07:23 PM Backport #58336 (Resolved): pacific: qa/standalone/mon: --mon-initial-members setting causes us t...
Backport Bot
07:23 PM Backport #58335 (Resolved): quincy: qa/standalone/mon: --mon-initial-members setting causes us to...
Backport Bot
07:20 PM Bug #57937 (Rejected): pg autoscaler of rgw pools doesn't work after creating otp pool
Not a Ceph issue per the last comment. Radoslaw Zarzynski
07:19 PM Bug #58130: LibRadosAio.SimpleWrite hang and pkill
Based on comment #14, we have a "fix candidate" that might also help with this issue.
If that's correct, we may wait for merging ...
Radoslaw Zarzynski
07:16 PM Bug #58132 (Pending Backport): qa/standalone/mon: --mon-initial-members setting causes us to popu...
Radoslaw Zarzynski
07:16 PM Backport #58334 (Resolved): quincy: mon/monclient: update "unable to obtain rotating service keys...
https://github.com/ceph/ceph/pull/50405 Backport Bot
07:16 PM Backport #58333 (Rejected): pacific: mon/monclient: update "unable to obtain rotating service key...
https://github.com/ceph/ceph/pull/54556 Backport Bot
07:14 PM Bug #17170 (Pending Backport): mon/monclient: update "unable to obtain rotating service keys when...
Radoslaw Zarzynski
07:13 PM Bug #48896: osd/OSDMap.cc: FAILED ceph_assert(osd_weight.count(i.first))
Low due to the low occurrence frequency. Radoslaw Zarzynski
06:58 PM Bug #58240 (Fix Under Review): osd/scrub: modifying osd_deep_scrub_stride while pg is doing deep ...
Radoslaw Zarzynski
06:51 PM Bug #58239 (Resolved): pacific: src/mon/Monitor.cc: FAILED ceph_assert(osdmon()->is_writeable())
Radoslaw Zarzynski
06:50 PM Bug #57017: mon-stretched_cluster: degraded stretched mode lead to Monitor crash
The pacific backport got reverted in https://github.com/ceph/ceph/pull/49412. Radoslaw Zarzynski
06:44 PM Bug #51729 (In Progress): Upmap verification fails for multi-level crush rule
Radoslaw Zarzynski
03:25 PM Bug #57105: quincy: ceph osd pool set <pool> size math error
This was fixed in main https://github.com/ceph/ceph/pull/44430 but was not backported to Q.
Instead of backporting t...
Matan Breizman
10:50 AM Bug #57105 (Fix Under Review): quincy: ceph osd pool set <pool> size math error
This PR was proposed after a BZ reported the same issue.
Matan Breizman
11:22 AM Bug #58288: quincy: mon: pg_num_check() according to crush rule
After the revert is merged (https://github.com/ceph/ceph/pull/49465),
pg_num_check() will return to not taking the c...
Matan Breizman
10:53 AM Bug #54188: Setting too many PGs leads error handling overflow
Setting this tracker as a duplicate. Seems like the same issue, and the PR proposed in 57105 should address this one as well. Matan Breizman

12/20/2022

09:57 PM Bug #47025: rados/test.sh: api_watch_notify_pp LibRadosWatchNotifyECPP.WatchNotify failed
https://github.com/ceph/ceph/pull/49109/commits/31750d5e8ae5f64edf934e2350dfa3c98df68b5a Brad Hubbard
09:56 PM Bug #47025 (Fix Under Review): rados/test.sh: api_watch_notify_pp LibRadosWatchNotifyECPP.WatchNo...
Brad Hubbard
12:06 PM Backport #58315 (In Progress): quincy: Valgrind reports memory "Leak_DefinitelyLost" errors.
Nitzan Mordechai
12:04 PM Backport #58314 (In Progress): pacific: Valgrind reports memory "Leak_DefinitelyLost" errors.
Nitzan Mordechai
11:56 AM Bug #58305: src/mon/AuthMonitor.cc: FAILED ceph_assert(version > keys_ver)
Radoslaw Zarzynski wrote:
> Thanks for the report! Do you have a corresponding log or coredump by any chance?
This lo...
yite gu
09:23 AM Bug #58316: Ceph health metric Scraping still broken
BTW this is the output of @smartctl -a --json@ on the device:... Janek Bevendorff
09:17 AM Bug #58316 (New): Ceph health metric Scraping still broken
This was brought up in #46285 already, but the issue has been marked as rejected.
When I run @ceph device scrape-h...
Janek Bevendorff

12/19/2022

07:08 PM Backport #58315 (Resolved): quincy: Valgrind reports memory "Leak_DefinitelyLost" errors.
https://github.com/ceph/ceph/pull/49522 Backport Bot
07:08 PM Backport #58314 (Resolved): pacific: Valgrind reports memory "Leak_DefinitelyLost" errors.
https://github.com/ceph/ceph/pull/49521 Backport Bot
07:00 PM Bug #58218 (Duplicate): osd
Radoslaw Zarzynski
06:59 PM Bug #58178 (Need More Info): FAILED ceph_assert(last_e.version.version < e.version.version)
Radoslaw Zarzynski
06:59 PM Bug #52136 (Pending Backport): Valgrind reports memory "Leak_DefinitelyLost" errors.
Radoslaw Zarzynski
06:58 PM Bug #57751 (Resolved): LibRadosAio.SimpleWritePP hang and pkill
Radoslaw Zarzynski
06:56 PM Bug #58288 (In Progress): quincy: mon: pg_num_check() according to crush rule
Just updating the tracker's state to fit the reality. Radoslaw Zarzynski
06:47 PM Bug #51652: heartbeat timeouts on filestore OSDs while deleting objects in upgrade:pacific-p2p-pa...
Lowered the priority as @FileStore@ is not only deprecated but also being removed right now. Radoslaw Zarzynski
06:40 PM Bug #58305 (Need More Info): src/mon/AuthMonitor.cc: FAILED ceph_assert(version > keys_ver)
Thanks for the report! Do you have a corresponding log or coredump by any chance? Radoslaw Zarzynski
05:06 PM Documentation #46126: RGW docs lack an explanation of how permissions management works, especiall...
Sure, very much appreciated.
Matt
Matt Benjamin
05:03 PM Documentation #46126: RGW docs lack an explanation of how permissions management works, especiall...
Matt,
I don't mean to endorse dirtwash's rudeness. I mean to capture an impassioned--if inelegant and abusive--req...
Zac Dover
10:01 AM Bug #58281 (Rejected): osd:memory usage exceeds the osd_memory_target
Igor Fedotov
03:56 AM Bug #58281: osd:memory usage exceeds the osd_memory_target
Igor Fedotov wrote:
> Please note that osd_memory_target is not a hard limit. It's just 'target' OSD usage that OSD ...
yite gu

12/17/2022

07:47 PM Bug #58305 (Need More Info): src/mon/AuthMonitor.cc: FAILED ceph_assert(version > keys_ver)
... yite gu

12/16/2022

04:44 PM Bug #58304 (Fix Under Review): pybind: ioctx.get_omap_keys asserts if start_after parameter is no...
Igor Fedotov
04:30 PM Bug #58304 (In Progress): pybind: ioctx.get_omap_keys asserts if start_after parameter is non-empty
Igor Fedotov
04:29 PM Bug #58304 (Pending Backport): pybind: ioctx.get_omap_keys asserts if start_after parameter is no...
Igor Fedotov

12/15/2022

10:51 PM Bug #51652: heartbeat timeouts on filestore OSDs while deleting objects in upgrade:pacific-p2p-pa...
/a/yuriw-2022-12-14_15:40:37-upgrade:pacific-p2p-pacific_16.2.11_RC-distro-default-smithi/7116495 Laura Flores
10:45 PM Bug #58289 (New): "AssertionError: wait_for_recovery: failed before timeout expired" from down pg...
/a/yuriw-2022-12-13_15:58:24-upgrade:pacific-p2p-pacific_16.2.11_RC-distro-default-smithi/7114849... Laura Flores
09:49 PM Bug #56034: qa/standalone/osd/divergent-priors.sh fails in test TEST_divergent_3()
/a/ksirivad-2022-12-15_06:28:05-rados-wip-ksirivad-testing-main-distro-default-smithi/7118004/ Kamoltat (Junior) Sirivadhna
08:45 PM Bug #58288 (Resolved): quincy: mon: pg_num_check() according to crush rule
Corresponding BZ: https://bugzilla.redhat.com/show_bug.cgi?id=2153654
Introduced here in Q: https://github.com/cep...
Matan Breizman
05:33 PM Bug #53789 (Fix Under Review): CommandFailedError (rados/test_python.sh): "RADOS object not found...
Radoslaw Zarzynski
05:04 PM Bug #53789: CommandFailedError (rados/test_python.sh): "RADOS object not found" causes test_rados...
Hypothesis no 1: the issue is a fallout from 65d05fdd579d21dd57b72b1d9148380bc6074269 (PR https://github.com/ceph/cep... Radoslaw Zarzynski
04:24 PM Bug #53575 (Resolved): Valgrind reports memory "Leak_PossiblyLost" errors concerning lib64
Laura Flores
04:14 PM Bug #53575: Valgrind reports memory "Leak_PossiblyLost" errors concerning lib64
https://github.com/ceph/ceph/pull/48641 merged Yuri Weinstein
04:14 PM Bug #57751: LibRadosAio.SimpleWritePP hang and pkill
https://github.com/ceph/ceph/pull/48641 merged Yuri Weinstein
04:14 PM Bug #52136: Valgrind reports memory "Leak_DefinitelyLost" errors.
https://github.com/ceph/ceph/pull/48641 merged Yuri Weinstein
02:39 PM Fix #57963 (Pending Backport): osd: Misleading information displayed for the running configuratio...
Sridhar Seshasayee
02:37 PM Fix #57963 (Resolved): osd: Misleading information displayed for the running configuration of osd...
Sridhar Seshasayee
01:58 PM Bug #58281: osd:memory usage exceeds the osd_memory_target
Please note that osd_memory_target is not a hard limit. It's just 'target' OSD usage that OSD attempts to align with.... Igor Fedotov
11:21 AM Bug #58281: osd:memory usage exceeds the osd_memory_target
ceph daemon osd.1 perf dump... yite gu
11:16 AM Bug #58281: osd:memory usage exceeds the osd_memory_target
... yite gu
11:06 AM Bug #58281 (Rejected): osd:memory usage exceeds the osd_memory_target
I want to limit osd_memory_target to 3758096384 bytes... yite gu
11:17 AM Bug #57529 (Resolved): mclock backfill is getting higher priority than WPQ
Sridhar Seshasayee
11:16 AM Backport #58273 (Resolved): quincy: mclock backfill is getting higher priority than WPQ
Sridhar Seshasayee

12/14/2022

07:12 PM Backport #58273 (In Progress): quincy: mclock backfill is getting higher priority than WPQ
Sridhar Seshasayee
07:08 PM Backport #58273 (Resolved): quincy: mclock backfill is getting higher priority than WPQ
https://github.com/ceph/ceph/pull/49437 Backport Bot
06:59 PM Bug #57529 (Pending Backport): mclock backfill is getting higher priority than WPQ
Neha Ojha
04:52 PM Bug #58178: FAILED ceph_assert(last_e.version.version < e.version.version)
I can not, sorry. I reported the issue as soon as I saw it, waited a day after it showed up, then reformatted the dri... Kevin Fox
03:20 PM Bug #58178: FAILED ceph_assert(last_e.version.version < e.version.version)
@Kevin Fox, can you please share the failing osd logs with osd debug 20?
We will suppose to print the previous log e...
Nitzan Mordechai
02:21 AM Bug #58218: osd
https://github.com/ceph/ceph/pull/40441 yite gu

12/13/2022

09:14 PM Bug #56785: crash: void OSDShard::register_and_wake_split_child(PG*): assert(!slot->waiting_for_s...
Only 4 occurrences of this crash in the wild, but let's keep an eye on this since now we have a test that reproduced it. Laura Flores
09:13 PM Bug #56785: crash: void OSDShard::register_and_wake_split_child(PG*): assert(!slot->waiting_for_s...
/a/yuriw-2022-12-10_00:03:28-rados-wip-yuri7-testing-2022-12-09-1107-quincy-distro-default-smithi/7111159 Laura Flores
03:59 PM Bug #51729: Upmap verification fails for multi-level crush rule
Update on this Tracker: I am discussing this scenario with Josh Salomon, someone who is very knowledgeable about bala... Laura Flores
05:10 AM Backport #58260 (In Progress): pacific: rados: fix extra tabs on warning for pool copy
Shreyansh Sancheti
04:45 AM Backport #58260 (Rejected): pacific: rados: fix extra tabs on warning for pool copy
https://github.com/ceph/ceph/pull/49400 Backport Bot
05:09 AM Backport #58259 (In Progress): quincy: rados: fix extra tabs on warning for pool copy
Shreyansh Sancheti
04:45 AM Backport #58259 (Rejected): quincy: rados: fix extra tabs on warning for pool copy
https://github.com/ceph/ceph/pull/49399 Backport Bot
04:40 AM Bug #58165 (Pending Backport): rados: fix extra tabs on warning for pool copy
Shreyansh Sancheti

12/12/2022

11:35 PM Bug #58239: pacific: src/mon/Monitor.cc: FAILED ceph_assert(osdmon()->is_writeable())
Analyzing the coredump:
Looking at the backtrace (same as above, but here, the frames are numbered)......
Laura Flores
04:08 PM Bug #58239: pacific: src/mon/Monitor.cc: FAILED ceph_assert(osdmon()->is_writeable())
The failure reproduced 7 times out of 50. Laura Flores
07:07 PM Bug #52129: LibRadosWatchNotify.AioWatchDelete failed
/a/yuriw-2022-12-07_15:47:33-rados-wip-yuri-testing-2022-12-06-1204-distro-default-smithi/7106771 Laura Flores
07:01 PM Bug #58165: rados: fix extra tabs on warning for pool copy
https://github.com/ceph/ceph/pull/49251 merged Yuri Weinstein
05:33 PM Bug #57989: test-erasure-eio.sh fails since pg is not in unfound
Looks related,
/a/yuriw-2022-12-09_22:27:10-rados-main-distro-default-smithi/7110655/...
Matan Breizman
05:27 PM Cleanup #58149 (Resolved): Clarify pool creation failure message due to exceeding max_pgs_per_osd
Laura Flores
05:26 PM Bug #58173 (Resolved): api_aio_pp: failure on LibRadosAio.SimplePoolEIOFlag and LibRadosAio.PoolE...
Laura Flores
05:01 PM Bug #57546: rados/thrash-erasure-code: wait_for_recovery timeout due to "active+clean+remapped+la...
@Radek it's also still in main Laura Flores
05:00 PM Bug #57546: rados/thrash-erasure-code: wait_for_recovery timeout due to "active+clean+remapped+la...
/a/yuriw-2022-12-07_15:48:38-rados-wip-yuri3-testing-2022-12-06-1211-distro-default-smithi/7106890 Laura Flores

12/11/2022

03:08 PM Bug #58240 (Fix Under Review): osd/scrub: modifying osd_deep_scrub_stride while pg is doing deep ...
Modify osd_deep_scrub_stride(e.g., 512KiB to 1MiB) while some pgs are doing deep scrub and then check the status of p... Zhansong Gao

12/09/2022

11:04 PM Bug #58239: pacific: src/mon/Monitor.cc: FAILED ceph_assert(osdmon()->is_writeable())
First seen in https://github.com/ceph/ceph/pull/48803
I scheduled some tests to run over the weekend so we can see...
Laura Flores
10:36 PM Bug #58239 (Resolved): pacific: src/mon/Monitor.cc: FAILED ceph_assert(osdmon()->is_writeable())
This is not deterministic, but ``run_osd`` in qa/standalone/ceph-helpers.sh can result in a timeout when trying to... Kamoltat (Junior) Sirivadhna
08:34 PM Bug #58052: Empty Pool (zero objects) shows usage.
Alright, please let me know when you have the file so I can remove it from my drive:
https://drive.google.com/file/d...
Brian Woods
08:26 PM Bug #55851 (Resolved): Assert in Ceph messenger
Konstantin Shalygin
08:25 PM Backport #57258 (Resolved): pacific: Assert in Ceph messenger
Konstantin Shalygin
06:49 PM Backport #57258: pacific: Assert in Ceph messenger
Backport Bot wrote:
> https://github.com/ceph/ceph/pull/48255
merged
Yuri Weinstein
08:03 PM Bug #55355 (Resolved): osd thread deadlock
Konstantin Shalygin
08:00 PM Backport #56722 (Resolved): pacific: osd thread deadlock
Konstantin Shalygin
06:49 PM Backport #56722: pacific: osd thread deadlock
Backport Bot wrote:
> https://github.com/ceph/ceph/pull/48254
merged
Yuri Weinstein
05:16 PM Bug #56028: thrash_cache_writeback_proxy_none: FAILED ceph_assert(version == old_value.version) i...
/a/yuriw-2022-12-08_15:36:34-rados-wip-yuri2-testing-2022-12-07-0821-pacific-distro-default-smithi/7108597... Laura Flores
05:05 PM Bug #56770: crash: void OSDShard::register_and_wake_split_child(PG*): assert(p != pg_slots.end())
We should address this crash, as it's been seen in 7 clusters over 100 times. Laura Flores
05:03 PM Bug #56770: crash: void OSDShard::register_and_wake_split_child(PG*): assert(p != pg_slots.end())
/a/yuriw-2022-12-08_15:36:34-rados-wip-yuri2-testing-2022-12-07-0821-pacific-distro-default-smithi/7108558... Laura Flores
04:48 PM Backport #58116 (Resolved): pacific: qa/workunits/rados/test_librados_build.sh: specify redirect ...
Laura Flores
08:35 AM Bug #56772: crash: uint64_t SnapSet::get_clone_bytes(snapid_t) const: assert(clone_overlap.count(...
Hi, Would it be possible to raise the priority of this bug to High (as well as #57940), as this prevent the incomplet... Thomas Le Gentil

12/08/2022

10:43 PM Backport #58039 (Resolved): pacific: osd: add created_at and ceph_version_when_created metadata
Igor Fedotov
08:59 PM Backport #58039: pacific: osd: add created_at and ceph_version_when_created metadata
https://github.com/ceph/ceph/pull/49144 merged Yuri Weinstein
06:57 PM Bug #48896: osd/OSDMap.cc: FAILED ceph_assert(osd_weight.count(i.first))
/a/yuriw-2022-12-02_14:50:43-rados-wip-yuri8-testing-2022-12-01-0905-pacific-distro-default-smithi/7101371 Laura Flores
04:48 PM Bug #17170: mon/monclient: update "unable to obtain rotating service keys when osd init" to sugge...
https://github.com/ceph/ceph/pull/48318 merged Yuri Weinstein
04:31 PM Bug #53789: CommandFailedError (rados/test_python.sh): "RADOS object not found" causes test_rados...
/a/yuriw-2022-12-06_15:43:07-rados-wip-yuri8-testing-2022-12-05-1031-pacific-distro-default-smithi/7105473$ Laura Flores
04:07 PM Bug #58098 (Fix Under Review): qa/workunits/rados/test_crash.sh: crashes are never posted
It seems reasonable to me! Laura Flores
01:10 AM Bug #58098: qa/workunits/rados/test_crash.sh: crashes are never posted
Laura Flores wrote:
> Can we make the default behavior a ceph user, and then provide --setgroup and --setuser option...
Tim Serong
09:27 AM Bug #58218: osd
OSD: crash... yite gu
09:26 AM Bug #58218 (Duplicate): osd
yite gu
07:55 AM Backport #58214 (In Progress): quincy: osd: Improve osd bench accuracy by using buffers with rand...
Sridhar Seshasayee
06:15 AM Backport #58214 (Resolved): quincy: osd: Improve osd bench accuracy by using buffers with random ...
https://github.com/ceph/ceph/pull/49323 Backport Bot
06:03 AM Fix #57577 (Pending Backport): osd: Improve osd bench accuracy by using buffers with random patterns
Sridhar Seshasayee
12:56 AM Bug #58052: Empty Pool (zero objects) shows usage.
Alright, I found the logs can be accessed from docker itself. In the process of pulling them, but I am already at 5G... Brian Woods

12/07/2022

10:37 PM Bug #58098: qa/workunits/rados/test_crash.sh: crashes are never posted
Can we make the default behavior a ceph user, and then provide --setgroup and --setuser options in case we need to re... Laura Flores
11:54 AM Bug #58098: qa/workunits/rados/test_crash.sh: crashes are never posted
Still waiting for that build (debuginfo seems to take an unbelievably long time to publish...)
Meanwhile, I did a ...
Tim Serong
05:51 AM Bug #58098: qa/workunits/rados/test_crash.sh: crashes are never posted
I've just rerun "rados/singleton/{all/test-crash mon_election/connectivity msgr-failures/few msgr/async objectstore/b... Tim Serong
05:30 AM Bug #58098: qa/workunits/rados/test_crash.sh: crashes are never posted
(Sorry, I didn't mean to update any of those fields with my previous comment) Tim Serong
02:34 AM Bug #58098: qa/workunits/rados/test_crash.sh: crashes are never posted
Thanks Laura, I'll try to figure out what's going on. So far, looking at the journal log, the keyring must be OK, or... Tim Serong
03:00 PM Bug #57546: rados/thrash-erasure-code: wait_for_recovery timeout due to "active+clean+remapped+la...
Sent a PR for quincy: https://github.com/ceph/ceph/pull/49304. Radoslaw Zarzynski
01:47 PM Bug #52657: MOSDPGLog::encode_payload(uint64_t): Assertion `HAVE_FEATURE(features, SERVER_NAUTILUS)'
I'm having the same BT in my tests:
/a/nmordech-2022-12-06_13:26:40-rados:thrash-erasure-code-wip-nitzan-peering-aut...
Nitzan Mordechai
01:34 PM Bug #56371 (Duplicate): crash: MOSDPGLog::encode_payload(unsigned long)
Radoslaw Zarzynski
12:04 PM Bug #58130: LibRadosAio.SimpleWrite hang and pkill
Laura, I think it is different than that bug (57751); in that case all the osds are still up.
We can see that we nev...
Nitzan Mordechai
09:57 AM Backport #58006 (In Progress): quincy: bail from handle_command() if _generate_command_map() fails
Ilya Dryomov
09:57 AM Backport #58007 (In Progress): pacific: bail from handle_command() if _generate_command_map() fails
Ilya Dryomov

12/06/2022

05:30 PM Bug #58098 (New): qa/workunits/rados/test_crash.sh: crashes are never posted
Laura Flores wrote:
> I scheduled some tests here with the reverts committed to see if they pass: http://pulpito.fro...
Laura Flores
03:48 PM Bug #58098: qa/workunits/rados/test_crash.sh: crashes are never posted
I scheduled some tests here with the reverts committed to see if they pass: http://pulpito.front.sepia.ceph.com/lflor... Laura Flores
03:41 PM Bug #58098: qa/workunits/rados/test_crash.sh: crashes are never posted
Yes, there's one available at /a/yuriw-2022-11-23_15:09:06-rados-wip-yuri10-testing-2022-11-22-1711-distro-default-sm... Laura Flores
06:11 AM Bug #58098: qa/workunits/rados/test_crash.sh: crashes are never posted
Is there a way to view the journalctl-b0.gz archive from the failed runs? Because if ceph-crash can't post crashes o... Tim Serong
11:53 AM Backport #58186 (In Progress): quincy: osd: Misleading information displayed for the running conf...
Sridhar Seshasayee
11:45 AM Backport #58186 (Resolved): quincy: osd: Misleading information displayed for the running configu...
https://github.com/ceph/ceph/pull/49281 Backport Bot
11:43 AM Fix #57963 (Pending Backport): osd: Misleading information displayed for the running configuratio...
Sridhar Seshasayee
10:02 AM Bug #58173 (Fix Under Review): api_aio_pp: failure on LibRadosAio.SimplePoolEIOFlag and LibRadosA...
Matan Breizman
05:09 AM Bug #57937: pg autoscaler of rgw pools doesn't work after creating otp pool
This problem was fixed in Rook v1.10.2. I updated my Rook/Ceph cluster to v1.10.5 and confirmed that this problem dis... Satoru Takeuchi
03:07 AM Bug #58182 (Fix Under Review): Suicide when osd bootup timeout
When the osd is started, if a message is lost, the OSD is stuck in the startup phase.
Restart the osd node through t...
Yao Wu
01:37 AM Bug #58106: when a large number of error ops appear in the OSDs,pglog does not trim.
Radoslaw Zarzynski wrote:
> Hello!
>
> what is on disk is actually serialized from the in-memory representati...
王子敬 wang
12:09 AM Bug #51945: qa/workunits/mon/caps.sh: Error: Expected return 13, got 0
/a/yuriw-2022-11-28_16:10:10-rados-wip-yuri6-testing-2022-11-23-1348-distro-default-smithi/7093588 Laura Flores

12/05/2022

11:37 PM Bug #58098: qa/workunits/rados/test_crash.sh: crashes are never posted
https://shaman.ceph.com/builds/ceph/wip-revert-pr-48713/2b583578473c82604cfdab2faef9f161dc2fb0b9/ Laura Flores
11:20 PM Bug #58098: qa/workunits/rados/test_crash.sh: crashes are never posted
The bug reproduced on Yuri's test branch. The difference between the test branch and the main SHA is that the test br... Laura Flores
07:23 PM Bug #58098: qa/workunits/rados/test_crash.sh: crashes are never posted
Laura Flores wrote:
> Scheduled 50x tests to run here: http://pulpito.front.sepia.ceph.com/lflores-2022-12-05_17:05:...
Laura Flores
07:22 PM Bug #58098: qa/workunits/rados/test_crash.sh: crashes are never posted
I have a feeling that the tests I scheduled earlier on the main branch all passed since the SHA it picked up is older... Laura Flores
07:14 PM Bug #58098: qa/workunits/rados/test_crash.sh: crashes are never posted
Wondering if there could have been a regression caused by https://github.com/ceph/ceph/pull/48713. Laura Flores
06:38 PM Bug #58098: qa/workunits/rados/test_crash.sh: crashes are never posted
/a/yuriw-2022-11-28_21:26:12-rados-wip-yuri7-testing-2022-11-18-1548-distro-default-smithi/7095988
/a/lflores-2022-1...
Laura Flores
04:17 PM Bug #58098: qa/workunits/rados/test_crash.sh: crashes are never posted
Scheduled 50x tests to run here: http://pulpito.front.sepia.ceph.com/lflores-2022-12-05_17:05:59-rados-wip-yuri10-tes... Laura Flores
04:10 PM Bug #58098: qa/workunits/rados/test_crash.sh: crashes are never posted
Three recent instances of this bug in the main branch point to a regression. My next steps here will be to schedule m... Laura Flores
10:46 PM Bug #58052: Empty Pool (zero objects) shows usage.
That is every log file from every node. There are no ceph-mgr* logs. :/
Even from inside the docker on the adm n...
Brian Woods
06:33 PM Bug #58052: Empty Pool (zero objects) shows usage.
Hello. Thanks for response and the files.... Radoslaw Zarzynski
09:11 PM Bug #58173: api_aio_pp: failure on LibRadosAio.SimplePoolEIOFlag and LibRadosAio.PoolEIOFlag
Building a branch here with https://github.com/ceph/ceph/pull/49029 reverted, which can be used to verify whether it ... Laura Flores
09:03 PM Bug #58173: api_aio_pp: failure on LibRadosAio.SimplePoolEIOFlag and LibRadosAio.PoolEIOFlag
Excuse my update Sam, I see you already added it as a duplicate. Laura Flores
08:55 PM Bug #58173: api_aio_pp: failure on LibRadosAio.SimplePoolEIOFlag and LibRadosAio.PoolEIOFlag
Matan added that test within the last two weeks: https://github.com/ceph/ceph/pull/49029 Samuel Just
07:10 PM Bug #58173 (Resolved): api_aio_pp: failure on LibRadosAio.SimplePoolEIOFlag and LibRadosAio.PoolE...
The workunits/rados/test.sh script is run in the orch suite on some tests. In a few of them, these two tests were fai... Adam King
08:06 PM Bug #58178: FAILED ceph_assert(last_e.version.version < e.version.version)
Noticed an osd, doing this, on a cluster over the weekend. It's been crashing consistently since. Kevin Fox
08:05 PM Bug #58178 (Need More Info): FAILED ceph_assert(last_e.version.version < e.version.version)
debug -4> 2022-12-05T19:14:03.556+0000 7fe51028a200 5 osd.57 pg_epoch: 261349 pg[1.573( v 261349'617978754 (2613... Kevin Fox
07:07 PM Bug #56733: Since Pacific upgrade, sporadic latencies plateau on random OSD/disks
I've just let Mark and Ronen know about this issue. Radoslaw Zarzynski
07:05 PM Bug #58156: Monitors do not permit OSD to join after upgrading to Quincy
Radoslaw Zarzynski wrote:
> Hi Igor! What was the intermediary version during the upgrade? We merged https://github....
Igor Fedotov
06:40 PM Bug #58156: Monitors do not permit OSD to join after upgrading to Quincy
Hi Igor! What was the intermediary version during the upgrade? We merged https://github.com/ceph/ceph/pull/44090 but ... Radoslaw Zarzynski
07:00 PM Bug #58142 (In Progress): rbd-python snaps-many-objects: deep-scrub : stat mismatch
Moving to @In progress@ based on the core standup of 1 Dec. Radoslaw Zarzynski
06:56 PM Bug #58106: when a large number of error ops appear in the OSDs,pglog does not trim.
Hello!
what is on disk is actually serialized from the in-memory representation. We don't see huge numbers of ...
Radoslaw Zarzynski
06:24 PM Bug #58166 (Need More Info): mon:DAEMON_OLD_VERSION newer versions is considered older than earlier
If your cluster is in the same state, can you please share mon logs with debug_mon=20? The following code snippet in ... Neha Ojha
02:53 PM Bug #58166: mon:DAEMON_OLD_VERSION newer versions is considered older than earlier
This was probably introduced in https://github.com/ceph/ceph/pull/36759 Tobias Urdin
02:52 PM Bug #58166 (Need More Info): mon:DAEMON_OLD_VERSION newer versions is considered older than earlier
We have a cluster with most mon/mgr/osd are running 16.2.10 and some OSDs are running 16.2.9
The healthcheck does ...
Tobias Urdin
06:24 PM Backport #58169 (Resolved): quincy: extra debugs for: [mon] high cpu usage by fn_monstore thread
https://github.com/ceph/ceph/pull/50406 Backport Bot
06:16 PM Feature #58168 (Pending Backport): extra debugs for: [mon] high cpu usage by fn_monstore thread
Radoslaw Zarzynski
06:10 PM Bug #53806: unessesarily long laggy PG state
> I think as long as `acting` does not have duplicate entries, the logic is exactly the same as before.
Yeah. I'm ...
Radoslaw Zarzynski
05:51 PM Backport #55768: pacific: rados_api_tests: LibRadosWatchNotify.AioWatchNotify2 fails
Backport Bot wrote:
> https://github.com/ceph/ceph/pull/46499
merged
Yuri Weinstein
05:34 PM Backport #56648: quincy: [Progress] Do not show NEW PG_NUM value for pool if autoscaler is set to...
Backport Bot wrote:
> https://github.com/ceph/ceph/pull/47925
merged
Yuri Weinstein
05:15 PM Fix #57963: osd: Misleading information displayed for the running configuration of osd_mclock_max...
https://github.com/ceph/ceph/pull/48708 merged Yuri Weinstein
05:12 PM Bug #57782: [mon] high cpu usage by fn_monstore thread
Radoslaw Zarzynski wrote:
> NOT A FIX (extra debugs): https://github.com/ceph/ceph/pull/48513
merged
Yuri Weinstein
04:02 PM Bug #58165 (Fix Under Review): rados: fix extra tabs on warning for pool copy
Laura Flores
12:57 PM Bug #58165 (Resolved): rados: fix extra tabs on warning for pool copy
BZ link: https://bugzilla.redhat.com/show_bug.cgi?id=2148242 Shreyansh Sancheti
03:52 PM Bug #57632 (Fix Under Review): test_envlibrados_for_rocksdb: free(): invalid pointer
Laura Flores
07:37 AM Bug #57940: ceph osd crashes with FAILED ceph_assert(clone_overlap.count(clone)) when nobackfill ...
Thomas Le Gentil wrote:
> I could avoid this crash by removing all pg for which ceph could not get the clone_bytes, ...
Thomas Le Gentil
 
