Activity
From 10/18/2022 to 11/16/2022
11/16/2022
- 07:11 PM Bug #57977: osd:tick checking mon for new map
- Thanks for the update! Yeah, it might be stuck there. To confirm, we would need logs with increased debug levels (maybe @debug_mon =...
- 07:06 PM Bug #51729: Upmap verification fails for multi-level crush rule
- Thanks for formulating the hypothesis!
Just updating to keep this ticket in the front of the tracker.
- 07:02 PM Bug #57546: rados/thrash-erasure-code: wait_for_recovery timeout due to "active+clean+remapped+la...
- Yeah, worth looking into; the msgr encode issue has the priority.
- 07:00 PM Bug #57757: ECUtil: terminate called after throwing an instance of 'ceph::buffer::v15_2_0::end_of...
- Discussed during the RADOS Team Meeting on 15 Nov.
Linking Nitzan's gist: https://gist.github.com/NitzanMordhai/...
- 06:58 PM Bug #57989: test-erasure-eio.sh fails since pg is not in unfound
- Definitely a low priority.
- 06:52 PM Bug #58027 (Closed): op slow from throttled to header_read
- Hello! The most important thing is that Octopus is EOL. Second, I'm also not sure whether this is really a bug. Seeing 0,5...
- 06:48 PM Bug #57632: test_envlibrados_for_rocksdb: free(): invalid pointer
- Do we know why switching to g++ 11 helps? Is it a known compiler bug?
- 05:47 PM Bug #57632: test_envlibrados_for_rocksdb: free(): invalid pointer
- I was able to schedule a teuthology run: http://pulpito.front.sepia.ceph.com/lflores-2022-11-16_15:49:13-rados:single...
- 01:11 PM Bug #57940: ceph osd crashes with FAILED ceph_assert(clone_overlap.count(clone)) when nobackfill ...
- The osd process does not crash if it is marked 'out'.
11/15/2022
- 08:44 AM Bug #56772: crash: uint64_t SnapSet::get_clone_bytes(snapid_t) const: assert(clone_overlap.count(...
- This bug is present in v17.2.5
- 07:32 AM Bug #58027 (Closed): op slow from throttled to header_read
- ceph version 15.2.7
Op spent 500ms from throttled to header_read...
- 12:24 AM Bug #57632: test_envlibrados_for_rocksdb: free(): invalid pointer
- There is also a coredump located at `/a/matan-2022-09-08_11:12:20-rados:singleton-main-distro-default-smithi/7020422/...
- 12:01 AM Bug #57632: test_envlibrados_for_rocksdb: free(): invalid pointer
- Some relevant frames:...
11/14/2022
- 11:39 PM Bug #57632: test_envlibrados_for_rocksdb: free(): invalid pointer
- I followed Brad's ubuntu 20.04 coredump tutorial: https://source.redhat.com/personal_blogs/debugging_a_ceph_osd_cored...
- 08:20 PM Bug #57632: test_envlibrados_for_rocksdb: free(): invalid pointer
- The original build is by now expired, so I'm rebuilding it here: https://shaman.ceph.com/builds/ceph/wip-kefu-testing...
- 08:14 PM Bug #57632: test_envlibrados_for_rocksdb: free(): invalid pointer
- Ran the test locally in an ubuntu 20.04 environment, and the test ran fine.
There is a coredump located under /a/k...
- 11:37 AM Bug #55750: mon: slow request of very long time
- {
"description": "osd_failure(failed timeout osd.6 [v2:10.172.98.151:6800/39,v1:10.172.98.151:68...
11/11/2022
- 08:31 PM Bug #56101: Gibba Cluster: 17.2.0 to 17.2.1 RC upgrade OSD crash in function safe_timer
- Also to note: We set `ceph config set mgr mgr_stats_period 1` on the gibba cluster to reproduce this bug. (This occur...
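For reference, the reproduction tuning mentioned above is a standard mgr config override (a minimal sketch; the `mgr_stats_period` value of 1 comes from the comment, the surrounding commands are illustrative):

```shell
# Shorten the mgr stats period to 1 second so pg-stats messages are sent
# far more often, making the shutdown race easier to hit on OSD restart.
ceph config set mgr mgr_stats_period 1

# Verify the running value before restarting OSDs to reproduce.
ceph config get mgr mgr_stats_period
```

Afterwards, the override can be dropped again with `ceph config rm mgr mgr_stats_period` to return to the default.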
- 06:27 PM Bug #49689: osd/PeeringState.cc: ceph_abort_msg("past_interval start interval mismatch") start
- I think https://tracker.ceph.com/issues/49689#note-31 makes sense and the following logs also show what max_oldest_ma...
- 10:08 AM Backport #58007: pacific: bail from handle_command() if _generate_command_map() fails
- please link this Backport tracker issue with GitHub PR https://github.com/ceph/ceph/pull/48846
ceph-backport.sh versi...
- 09:07 AM Backport #58007 (Resolved): pacific: bail from handle_command() if _generate_command_map() fails
- https://github.com/ceph/ceph/pull/48846
- 10:03 AM Backport #58006: quincy: bail from handle_command() if _generate_command_map() fails
- please link this Backport tracker issue with GitHub PR https://github.com/ceph/ceph/pull/48845
ceph-backport.sh versi...
- 09:07 AM Backport #58006 (Resolved): quincy: bail from handle_command() if _generate_command_map() fails
- https://github.com/ceph/ceph/pull/48845
- 09:01 AM Bug #57859 (Pending Backport): bail from handle_command() if _generate_command_map() fails
- PR https://github.com/ceph/ceph/pull/48044 has been merged in main.
11/10/2022
- 11:37 PM Bug #56101 (Fix Under Review): Gibba Cluster: 17.2.0 to 17.2.1 RC upgrade OSD crash in function s...
- 11:21 PM Bug #56101 (In Progress): Gibba Cluster: 17.2.0 to 17.2.1 RC upgrade OSD crash in function safe_t...
- 04:52 AM Bug #56101: Gibba Cluster: 17.2.0 to 17.2.1 RC upgrade OSD crash in function safe_timer
- Thanks for your work in capturing the core Laura.
I had a look at the coredump and it shows exactly what we had sp...
- 07:14 PM Bug #52657: MOSDPGLog::encode_payload(uint64_t): Assertion `HAVE_FEATURE(features, SERVER_NAUTILUS)'
- /a/yuriw-2022-10-17_17:31:25-rados-wip-yuri7-testing-2022-10-17-0814-distro-default-smithi/7071031
- 11:50 AM Bug #57989: test-erasure-eio.sh fails since pg is not in unfound
- For some reason, the pool already exist...
- 08:44 AM Bug #57757 (In Progress): ECUtil: terminate called after throwing an instance of 'ceph::buffer::v...
- 08:42 AM Bug #57618 (Fix Under Review): rados/test.sh hang and pkilled (LibRadosWatchNotifyEC.WatchNotify)
- 08:34 AM Bug #57618: rados/test.sh hang and pkilled (LibRadosWatchNotifyEC.WatchNotify)
- Some of the OSDs stopped due to valgrind errors. This is a duplicate of another bug.
- 08:39 AM Bug #57751 (Fix Under Review): LibRadosAio.SimpleWritePP hang and pkill
- 07:38 AM Bug #57546: rados/thrash-erasure-code: wait_for_recovery timeout due to "active+clean+remapped+la...
- Thanks for taking a look Radek! That's a good point since we are seeing this issue with rados/thrash-erasure-code tes...
11/09/2022
- 10:56 PM Bug #56101: Gibba Cluster: 17.2.0 to 17.2.1 RC upgrade OSD crash in function safe_timer
- Managed to reproduce this on the Gibba cluster and produce a coredump!
The core file is located on gibba001 under ...
- 08:18 PM Backport #57704 (Resolved): quincy: mon/MonMap.h: FAILED ceph_assert(m < ranks.size()) when reduc...
- https://github.com/ceph/ceph/pull/48321
- 08:17 PM Backport #57705 (Resolved): pacific: mon/MonMap.h: FAILED ceph_assert(m < ranks.size()) when redu...
- https://github.com/ceph/ceph/pull/48320
- 08:17 PM Bug #50089 (Resolved): mon/MonMap.h: FAILED ceph_assert(m < ranks.size()) when reducing number of...
- 04:34 PM Bug #51729: Upmap verification fails for multi-level crush rule
- Thanks again for looking at this.
I haven't looked further, but I suspect the issue will come down to the variable...
11/08/2022
- 09:23 PM Bug #57017: mon-stretched_cluster: degraded stretched mode lead to Monitor crash
- pacific backport: https://github.com/ceph/ceph/pull/48803
- 08:59 PM Bug #57017: mon-stretched_cluster: degraded stretched mode lead to Monitor crash
- quincy backport: https://github.com/ceph/ceph/pull/48802
- 07:23 PM Bug #51729: Upmap verification fails for multi-level crush rule
- I believe I've reproduced the issue using the osdmaps that Chris provided.
First, I used the osdmaptool to run the...
- 02:08 PM Bug #57757: ECUtil: terminate called after throwing an instance of 'ceph::buffer::v15_2_0::end_of...
- After rechecking the logs, it looks like we are taking 2 different versions of smithi01231941-9:head.
All chunks with ...
- 05:44 AM Bug #57757: ECUtil: terminate called after throwing an instance of 'ceph::buffer::v15_2_0::end_of...
- @Laura, thanks for confirming that in the coredump. Yes, shard0 is also showing that when it gets the chunk from bluestore:
...
- 12:07 AM Bug #57757: ECUtil: terminate called after throwing an instance of 'ceph::buffer::v15_2_0::end_of...
- Brad and I did some more debugging today.
Here is the end of the log associated with the coredump:...
11/07/2022
- 09:27 PM Bug #57977: osd:tick checking mon for new map
- Radoslaw Zarzynski wrote:
> Octopus is EOL. Does it happen on a supported release?
>
> Regardless of that, could ...
- 06:13 PM Bug #57977 (Need More Info): osd:tick checking mon for new map
- Octopus is EOL. Does it happen on a supported release?
Regardless of that, could you please provide logs from this...
- 07:30 PM Bug #57757: ECUtil: terminate called after throwing an instance of 'ceph::buffer::v15_2_0::end_of...
- Also to note, we can see information about argument `to_read` here:...
- 07:27 PM Bug #57757: ECUtil: terminate called after throwing an instance of 'ceph::buffer::v15_2_0::end_of...
- @Nitzan, what do you think about this analysis? Or are there any other frames/locals you'd like me to check?
- 07:12 PM Bug #57757: ECUtil: terminate called after throwing an instance of 'ceph::buffer::v15_2_0::end_of...
- Looking at frame 12, I can see that the incorrect length (262144) for shard 0 is evident in the local variable "from"...
- 06:02 PM Bug #57757: ECUtil: terminate called after throwing an instance of 'ceph::buffer::v15_2_0::end_of...
- Got it to detect the right symbols with the new build!
I will attempt to analyze this coredump at a deeper level, ...
- 03:16 PM Bug #57757: ECUtil: terminate called after throwing an instance of 'ceph::buffer::v15_2_0::end_of...
- According to Brad, the build needs to be as close to the test branch that originally experienced the crash as possibl...
- 07:18 PM Bug #51729: Upmap verification fails for multi-level crush rule
- Thanks Chris! @Radek I have been taking some time to analyze this scenario, and will post updates soon.
- 06:36 PM Bug #51729: Upmap verification fails for multi-level crush rule
- Thanks for the info! Laura, would you mind taking another look?
- 06:36 PM Bug #51729 (New): Upmap verification fails for multi-level crush rule
- 06:43 PM Bug #50219 (Closed): qa/standalone/erasure-code/test-erasure-eio.sh fails since pg is not in reco...
- The original issue was caused by a commit in a wip branch being tested, so it's highly improbable it's a reoccurrence....
- 06:42 PM Bug #57989 (New): test-erasure-eio.sh fails since pg is not in unfound
- /a/lflores-2022-10-17_18:19:55-rados:standalone-main-distro-default-smithi/7071287...
- 06:35 PM Bug #57845: MOSDRepOp::encode_payload(uint64_t): Assertion `HAVE_FEATURE(features, SERVER_OCTOPUS...
- Likely it's even a duplicate of https://tracker.ceph.com/issues/52657.
- 06:28 PM Bug #52136 (Fix Under Review): Valgrind reports memory "Leak_DefinitelyLost" errors.
- 06:26 PM Bug #57940 (Duplicate): ceph osd crashes with FAILED ceph_assert(clone_overlap.count(clone)) when...
- Looks like a duplicate of 56772.
- 06:24 PM Bug #55141: thrashers/fastread: assertion failure: rollback_info_trimmed_to == head
- Nitzan Mordechai wrote:
> Radoslaw Zarzynski wrote:
> > Well, just found a new occurance.
> Where can i find it?
...
- 06:12 PM Bug #56101: Gibba Cluster: 17.2.0 to 17.2.1 RC upgrade OSD crash in function safe_timer
- Brad and I ran a reproducer on the gibba cluster (restarting OSDs with `for osd in $(systemctl -l |grep osd|gawk '{pr...
- 06:01 PM Bug #56101: Gibba Cluster: 17.2.0 to 17.2.1 RC upgrade OSD crash in function safe_timer
- Is there any news on that?
- 05:59 PM Bug #49689: osd/PeeringState.cc: ceph_abort_msg("past_interval start interval mismatch") start
- Updated the PR link.
- 01:08 AM Bug #57937: pg autoscaler of rgw pools doesn't work after creating otp pool
- Are there any updates? Please let me know if I can do something.
11/06/2022
- 05:47 AM Bug #57757: ECUtil: terminate called after throwing an instance of 'ceph::buffer::v15_2_0::end_of...
- @Brad, maybe it's a good candidate for another upstream coredump-analysis blog post like the one you talked about (ubuntu 20.04).
11/04/2022
- 07:21 PM Bug #57757: ECUtil: terminate called after throwing an instance of 'ceph::buffer::v15_2_0::end_of...
- @Brad do you have any tips on how to load the correct debug symbols for the above coredump? After running the `ceph-d...
- 05:48 PM Bug #57757: ECUtil: terminate called after throwing an instance of 'ceph::buffer::v15_2_0::end_of...
- No luck yet, but I'm trying to set up the right debug environment. So far, gdb is only giving me question marks, but ...
- 06:10 AM Bug #57757: ECUtil: terminate called after throwing an instance of 'ceph::buffer::v15_2_0::end_of...
- Laura, are you able to use GDB with debuginfo on that coredump file?
- 04:18 PM Bug #57977 (Pending Backport): osd:tick checking mon for new map
- ceph version: 15.2.7
my cluster has an osd down, and it is unable to join the osdmap....
- 09:17 AM Feature #48392: ceph ignores --keyring?
- This issue is still present in Pacific. Is there any way to work around it except for moving the keys to /etc/ceph?
...
11/03/2022
- 08:56 PM Bug #57757: ECUtil: terminate called after throwing an instance of 'ceph::buffer::v15_2_0::end_of...
- By the way, I have the coredump saved on the teuthology node under /home/lflores/tracker_57757.
- 03:31 PM Bug #57757: ECUtil: terminate called after throwing an instance of 'ceph::buffer::v15_2_0::end_of...
- The output Nitzan pasted is from printing ECBackend::read_result_t:
src/osd/ECBackend.cc... - 03:23 PM Bug #57757: ECUtil: terminate called after throwing an instance of 'ceph::buffer::v15_2_0::end_of...
- Perhaps there is somewhere that the length should be getting updated, but it is not?
- 02:27 PM Bug #57969 (New): monitor: ceph -s shows all monitors out of quorum for < 1s
- Ceph -s UI shows all monitors out of quorum for a very short time (< 1s).
The issue is likely to have no real effect on the ...
- 02:42 AM Bug #52657: MOSDPGLog::encode_payload(uint64_t): Assertion `HAVE_FEATURE(features, SERVER_NAUTILUS)'
- Hi Brad, thanks for all the pointers on the tracker!
I went through the code with Josh and Radek after looking at yo...
11/02/2022
- 03:53 PM Fix #57963 (Fix Under Review): osd: Misleading information displayed for the running configuratio...
- With the fix, the following is shown for an OSD with ssd as the underlying device type:...
- 03:26 PM Fix #57963: osd: Misleading information displayed for the running configuration of osd_mclock_max...
- See BZ: https://bugzilla.redhat.com/show_bug.cgi?id=2111282 for additional information.
- 03:25 PM Fix #57963 (Resolved): osd: Misleading information displayed for the running configuration of osd...
- For the inactive device type(hdd/ssd) of an OSD, the running configuration option osd_mclock_max_capacity_iops_[hdd|s...
- 06:40 AM Bug #57533 (Fix Under Review): Able to modify the mclock reservation, weight and limit parameters...
10/31/2022
- 01:58 PM Bug #53729 (Resolved): ceph-osd takes all memory before oom on boot
- 01:58 PM Backport #55633 (Rejected): octopus: ceph-osd takes all memory before oom on boot
- Octopus is EOL
- 12:41 PM Bug #57757: ECUtil: terminate called after throwing an instance of 'ceph::buffer::v15_2_0::end_of...
- ...
- 03:57 AM Bug #52657: MOSDPGLog::encode_payload(uint64_t): Assertion `HAVE_FEATURE(features, SERVER_NAUTILUS)'
- Still trying to run a test with added debugging due to the ongoing infra issues but I noticed that Coverity CID 15096...
10/29/2022
- 10:06 PM Documentation #46126: RGW docs lack an explanation of how permissions management works, especiall...
- You thought that copying this rude exchange verbatim was essential to motivate improving the docs?
Matt
10/27/2022
- 06:07 PM Bug #57940 (Duplicate): ceph osd crashes with FAILED ceph_assert(clone_overlap.count(clone)) when...
- Hi, I have this current crash:
I've experienced a disk failure in my ceph cluster.
I've replaced the disk, but no...
- 04:50 PM Bug #57757: ECUtil: terminate called after throwing an instance of 'ceph::buffer::v15_2_0::end_of...
- @Laura, thanks for that! I'll try first with main, as you suggested.
- 03:32 PM Bug #57757: ECUtil: terminate called after throwing an instance of 'ceph::buffer::v15_2_0::end_of...
- @Nitzan, here is the branch if you'd like to rebuild it on ci: https://github.com/ljflores/ceph/commits/wip-lflores-t...
- 10:36 AM Bug #57757: ECUtil: terminate called after throwing an instance of 'ceph::buffer::v15_2_0::end_of...
- The coredump is from branch wip-lflores-testing. I was not able to create a docker image since this branch is no longer av...
- 12:17 PM Bug #55141: thrashers/fastread: assertion failure: rollback_info_trimmed_to == head
- Radoslaw Zarzynski wrote:
> Well, just found a new occurance.
Where can I find it?
- 12:13 PM Bug #50042 (In Progress): rados/test.sh: api_watch_notify failures
- 12:12 PM Bug #52136 (In Progress): Valgrind reports memory "Leak_DefinitelyLost" errors.
- 11:47 AM Bug #57751 (In Progress): LibRadosAio.SimpleWritePP hang and pkill
- 10:55 AM Bug #57751: LibRadosAio.SimpleWritePP hang and pkill
- This is not an issue with the test; not all the OSDs are up, and we are waiting (valgrind reports a memory leak from rock...
- 04:26 AM Bug #57937 (Rejected): pg autoscaler of rgw pools doesn't work after creating otp pool
- It's about my following post to the ceph-users ML.
https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/threa...
10/26/2022
- 11:25 PM Bug #57017 (Pending Backport): mon-stretched_cluster: degraded stretched mode lead to Monitor crash
- 09:18 PM Bug #52129: LibRadosWatchNotify.AioWatchDelete failed
- /a/yuriw-2022-10-19_18:35:19-rados-wip-yuri10-testing-2022-10-19-0810-distro-default-smithi/7074802
- 02:52 PM Bug #57883 (Resolved): test-erasure-code.sh: TEST_rados_put_get_jerasure fails on "rados_put_get:...
- 01:45 PM Bug #50042: rados/test.sh: api_watch_notify failures
- ...
- 04:58 AM Bug #50042: rados/test.sh: api_watch_notify failures
- I checked all the list_watchers failures (checking the size of the watch list). It looks like the watcher timed out and that ...
- 06:09 AM Bug #52657: MOSDPGLog::encode_payload(uint64_t): Assertion `HAVE_FEATURE(features, SERVER_NAUTILUS)'
- I was able to gather a coredump and set up a binary compatible environment to debug it from this run Laura started in...
- 04:58 AM Bug #49689: osd/PeeringState.cc: ceph_abort_msg("past_interval start interval mismatch") start
- I wrote up a working explanation of PastIntervals in https://github.com/athanatos/ceph/tree/sjust/wip-49689-past-int...
- 12:07 AM Bug #57845 (New): MOSDRepOp::encode_payload(uint64_t): Assertion `HAVE_FEATURE(features, SERVER_O...
- Notes from rados team meeting:
Seems like the same class of bugs we hit in https://tracker.ceph.com/issues/52657 a...
10/25/2022
- 11:14 PM Bug #51729: Upmap verification fails for multi-level crush rule
- I put together the following contrived example to illustrate the problem. Again, this is pacific 16.2.9 on rocky8 li...
- 05:19 PM Bug #50219 (New): qa/standalone/erasure-code/test-erasure-eio.sh fails since pg is not in recover...
- The failure actually reproduced here:
/a/lflores-2022-10-17_18:19:55-rados:standalone-main-distro-default-smithi/7...
- 05:06 PM Bug #57883 (Fix Under Review): test-erasure-code.sh: TEST_rados_put_get_jerasure fails on "rados_...
- 02:21 PM Bug #57883 (In Progress): test-erasure-code.sh: TEST_rados_put_get_jerasure fails on "rados_put_g...
- 02:19 PM Bug #57900 (In Progress): mon/crush_ops.sh: mons out of quorum
- 02:17 PM Bug #57900: mon/crush_ops.sh: mons out of quorum
- @Radek so the suggestion is to give the mons more time to reboot?
This is the workunit:
https://github.com/ceph/c...
10/24/2022
- 06:18 PM Bug #57852: osd: unhealthy osd cannot be marked down in time
- Not something we introduced recently, but still worth taking a look at if nothing urgent is on the plate.
- 06:17 PM Bug #57852 (New): osd: unhealthy osd cannot be marked down in time
- Thanks for the detailed explanation!
- 06:10 PM Bug #57845: MOSDRepOp::encode_payload(uint64_t): Assertion `HAVE_FEATURE(features, SERVER_OCTOPUS...
- Just before the crash time-outs were seen:...
- 06:05 PM Bug #57915: LibRadosWatchNotify.AioNotify - error callback ceph_assert(ref > 0)
- Yes, this is one of the Notify bugs that I hit during my tests.
- 05:14 PM Bug #57915: LibRadosWatchNotify.AioNotify - error callback ceph_assert(ref > 0)
- Nitzan, I recall you mentioned some watch-related tests on today's stand-up. Is this one of them?
- 05:57 PM Bug #57546: rados/thrash-erasure-code: wait_for_recovery timeout due to "active+clean+remapped+la...
- As this is about EC: can the acting set's items be duplicated?
- 05:55 PM Bug #57546: rados/thrash-erasure-code: wait_for_recovery timeout due to "active+clean+remapped+la...
- If https://github.com/ceph/ceph/pull/47901/commits/0d07b406dc2f854363f7ae9b970e980400f4f03e is the actual culprit, th...
- 05:42 PM Bug #57883: test-erasure-code.sh: TEST_rados_put_get_jerasure fails on "rados_put_get: grep '\<5...
- It looks like we asked to take osd.5 down, got a confirmation that the command was handled by the mon, and then @get_osd@ said %5...
- 05:25 PM Bug #57900: mon/crush_ops.sh: mons out of quorum
- Just a **suggestion** from the bug scrub: this is a mon thrashing test. None of the mon logs seems to have a trace of a crash...
- 05:18 PM Bug #55141: thrashers/fastread: assertion failure: rollback_info_trimmed_to == head
- Well, just found a new occurrence.
- 05:11 PM Bug #55141: thrashers/fastread: assertion failure: rollback_info_trimmed_to == head
- Lowering the priority as we haven't seen a reoccurrence since last time.
- 05:17 PM Bug #57913 (Duplicate): Thrashosd: timeout 120 ceph --cluster ceph osd pool rm unique_pool_2 uniq...
- In the teuthology log:...
- 05:10 PM Bug #57529 (Fix Under Review): mclock backfill is getting higher priority than WPQ
- 04:06 AM Bug #56101: Gibba Cluster: 17.2.0 to 17.2.1 RC upgrade OSD crash in function safe_timer
- Laura Flores wrote:
> Notes from the rados suite review:
>
> We may need to check if we're shutting down while se...
10/23/2022
- 11:45 AM Bug #57915 (New): LibRadosWatchNotify.AioNotify - error callback ceph_assert(ref > 0)
- /a//nmordech-2022-10-23_05:26:13-rados:verify-wip-nm-51282-distro-default-smithi/7077932...
- 05:19 AM Bug #57699: slow osd boot with valgrind (reached maximum tries (50) after waiting for 300 seconds)
- Sridhar, yes, those trackers look the same; valgrind makes the osd start slower, maybe that's the reason we are seeing...
10/21/2022
- 04:19 PM Bug #55809: "Leak_IndirectlyLost" valgrind report on mon.c
- /a/yuriw-2022-10-12_16:24:50-rados-wip-yuri8-testing-2022-10-12-0718-quincy-distro-default-smithi/7063948/
- 04:16 PM Bug #57913 (Duplicate): Thrashosd: timeout 120 ceph --cluster ceph osd pool rm unique_pool_2 uniq...
- /a/yuriw-2022-10-12_16:24:50-rados-wip-yuri8-testing-2022-10-12-0718-quincy-distro-default-smithi/7063868/
rados/t...
- 08:41 AM Bug #57699: slow osd boot with valgrind (reached maximum tries (50) after waiting for 300 seconds)
- @Nitzan Mordechai this is probably similar to
https://tracker.ceph.com/issues/52948 and https://tracker.ceph.com/is...
- 07:47 AM Fix #57040 (Resolved): osd: Update osd's IOPS capacity using async Context completion instead of ...
- 07:46 AM Backport #57443 (Resolved): quincy: osd: Update osd's IOPS capacity using async Context completio...
10/20/2022
- 11:33 PM Bug #56101: Gibba Cluster: 17.2.0 to 17.2.1 RC upgrade OSD crash in function safe_timer
- Notes from the rados suite review:
We may need to check if we're shutting down while sending pg stats; if so, we d...
- 03:07 PM Bug #57152 (Resolved): segfault in librados via libcephsqlite
- 03:06 PM Backport #57373 (Resolved): pacific: segfault in librados via libcephsqlite
- 02:56 PM Backport #57373: pacific: segfault in librados via libcephsqlite
- Backport Bot wrote:
> https://github.com/ceph/ceph/pull/48187
merged
10/19/2022
- 09:21 PM Backport #52747 (In Progress): pacific: MON_DOWN during mon_join process
- 09:09 PM Backport #52746 (Rejected): octopus: MON_DOWN during mon_join process
- Octopus is EOL.
- 08:59 PM Bug #43584: MON_DOWN during mon_join process
- /a/yuriw-2022-10-05_20:44:57-rados-wip-yuri4-testing-2022-10-05-0917-pacific-distro-default-smithi/7055594
- 08:46 PM Bug #57900 (In Progress): mon/crush_ops.sh: mons out of quorum
- /a/teuthology-2022-10-09_07:01:03-rados-quincy-distro-default-smithi/7059463...
- 03:20 PM Bug #57698 (Pending Backport): osd/scrub: "scrub a chunk" requests are sent to the wrong set of r...
- 10:29 AM Bug #57699: slow osd boot with valgrind (reached maximum tries (50) after waiting for 300 seconds)
- The issue is that we are hitting a deadlock under a specific condition. When we are trying to update the mClockScheduler config c...
- 05:31 AM Bug #57546: rados/thrash-erasure-code: wait_for_recovery timeout due to "active+clean+remapped+la...
- I was able to reproduce this using the test Laura mentioned above - http://pulpito.front.sepia.ceph.com/amathuri-2022...
10/18/2022
- 04:31 PM Bug #51729: Upmap verification fails for multi-level crush rule
- Chris, can you please provide your osdmap binary?
- 09:03 AM Bug #57845: MOSDRepOp::encode_payload(uint64_t): Assertion `HAVE_FEATURE(features, SERVER_OCTOPUS...
- Hi Neha,
the logs from the crash instance that I reported initially are already rotated out on the particular node...
- 02:48 AM Bug #57852: osd: unhealthy osd cannot be marked down in time
- Radoslaw Zarzynski wrote:
> Could you please clarify a bit? Do you mean there some extra, unnecessary (from the POV ...