Activity
From 02/19/2020 to 03/19/2020
03/19/2020
- 10:43 PM Backport #44689 (Resolved): nautilus: osd/osd-scrub-repair.sh fails: scrub/osd-scrub-repair.sh:69...
- https://github.com/ceph/ceph/pull/35048
- 10:43 PM Backport #44686 (Resolved): nautilus: osd/osd-backfill-stats.sh TEST_backfill_out2: wait_for_clea...
- https://github.com/ceph/ceph/pull/35047
- 10:43 PM Backport #44685 (Resolved): octopus: osd/osd-backfill-stats.sh TEST_backfill_out2: wait_for_clean...
- https://github.com/ceph/ceph/pull/34806
- 09:27 PM Bug #44684: pgs entering premerge state that still need backfill
- it's the mgr's fault.....
- 09:02 PM Bug #44684 (Resolved): pgs entering premerge state that still need backfill
- ...
- 08:59 PM Bug #43807: osd-backfill-recovery-log.sh fails
- /a/sage-2020-03-17_13:59:54-rados-wip-sage-testing-2020-03-17-0740-distro-basic-smithi/4863239
Comparing a failed ...
- 07:19 PM Bug #43861 (Resolved): ceph_test_rados_watch_notify hang
- 04:44 PM Bug #43914 (Resolved): nautilus: ceph tell command times out
- 03:14 PM Bug #44662 (Fix Under Review): qa/standalone/osd/osd-markdown.sh: markdown_N_impl fails in TEST_m...
- 08:34 AM Bug #44662: qa/standalone/osd/osd-markdown.sh: markdown_N_impl fails in TEST_markdown_boot
- aah, this looks like it has the same root cause as https://tracker.ceph.com/issues/44518.
On the monitor side:
...
- 01:02 PM Bug #44518 (Pending Backport): osd/osd-backfill-stats.sh TEST_backfill_out2: wait_for_clean timeout
- 03:51 AM Bug #44518 (Fix Under Review): osd/osd-backfill-stats.sh TEST_backfill_out2: wait_for_clean timeout
03/18/2020
- 10:33 PM Bug #44518 (In Progress): osd/osd-backfill-stats.sh TEST_backfill_out2: wait_for_clean timeout
- 09:30 PM Bug #43861 (Fix Under Review): ceph_test_rados_watch_notify hang
- 09:16 PM Bug #43861: ceph_test_rados_watch_notify hang
- i think we should consider just dropping this test. it's old test code written by colin almost 10 years ago and i'm ...
- 09:14 PM Bug #43861: ceph_test_rados_watch_notify hang
- another one like comment 5 above...
- 07:31 PM Bug #44439 (Pending Backport): osd/osd-scrub-repair.sh fails: scrub/osd-scrub-repair.sh:698: TEST...
- 05:13 PM Bug #44062 (Resolved): LibRadosWatchNotify.WatchNotify failure
- 02:16 AM Bug #44062 (In Progress): LibRadosWatchNotify.WatchNotify failure
- 02:15 AM Bug #44062: LibRadosWatchNotify.WatchNotify failure
- https://github.com/ceph/ceph/pull/34011
- 05:13 PM Bug #44582 (Resolved): LibRadosMisc.ShutdownRace
- 02:32 PM Feature #44025 (Resolved): Make it harder to set pool replica size to 1
- Based on https://github.com/rook/rook/pull/5023#issuecomment-600344198
- 02:23 PM Feature #44025 (Pending Backport): Make it harder to set pool replica size to 1
- 08:15 AM Bug #44184: Slow / Hanging Ops after pool creation
- Andrew Mitroshin wrote:
> Could you please submit output for the command
>
> [...]
% ceph osd dump | grep req...
- 08:09 AM Bug #44184: Slow / Hanging Ops after pool creation
- Dan van der Ster wrote:
> See https://tracker.ceph.com/issues/37875
>
> With `ceph pg dump -f json | jq .osd_epoc...
- 07:38 AM Bug #43975 (Resolved): Slow Requests/OP's types not getting logged
- While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ...
- 07:36 AM Backport #44413 (Resolved): nautilus: FTBFS on s390x in openSUSE Build Service due to presence of...
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/33716
m...
- 07:35 AM Backport #44259 (Resolved): nautilus: Slow Requests/OP's types not getting logged
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/33503
m...
- 02:37 AM Bug #44658 (Fix Under Review): seastar is busting unit tests
03/17/2020
- 07:51 PM Bug #44352: pool listings are slow after deleting objects
- For other pools, it takes just a fraction of time to list objects with ...
- 07:44 PM Bug #44352: pool listings are slow after deleting objects
- Abhishek Lekshmanan wrote:
> around 20s were taken to just list contents of the pool which is what happened in the d...
- 07:21 PM Bug #44662 (Resolved): qa/standalone/osd/osd-markdown.sh: markdown_N_impl fails in TEST_markdown_...
- ...
- 06:21 PM Backport #44413: nautilus: FTBFS on s390x in openSUSE Build Service due to presence of -O2 in RPM...
- Kefu Chai wrote:
> https://github.com/ceph/ceph/pull/33716
merged
- 06:16 PM Backport #44259: nautilus: Slow Requests/OP's types not getting logged
- Sridhar Seshasayee wrote:
> https://github.com/ceph/ceph/pull/33503
merged
- 06:08 PM Bug #43888: osd/osd-bench.sh 'tell osd.N bench' hang
- /a/nojha-2020-03-16_17:35:35-rados:standalone-master-distro-basic-smithi/4860664/
- 05:57 PM Bug #43807 (New): osd-backfill-recovery-log.sh fails
- Note that this is a resurrection of the same failure with different symptoms
/a/sage-2020-03-17_13:59:54-rados-wip...
- 05:21 PM Bug #43807: osd-backfill-recovery-log.sh fails
- /a/sage-2020-03-17_13:59:54-rados-wip-sage-testing-2020-03-17-0740-distro-basic-smithi/4863239
- 05:04 PM Bug #44311: crash in Objecter and CRUSH map lookup
- Conversation from IRC:
<neha> mahatic: I would like to know Aemerson's thoughts on https://tracker.ceph.com/issues...
- 04:11 PM Bug #44582: LibRadosMisc.ShutdownRace
- 110 is ETIMEDOUT...
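For reference, the 110 → ETIMEDOUT mapping can be checked from Python's standard errno module (values shown are the Linux ones):

```python
import errno
import os

# On Linux, errno 110 is ETIMEDOUT ("Connection timed out"),
# which matches the return code seen in this test failure.
print(errno.errorcode[110])          # -> 'ETIMEDOUT'
print(os.strerror(errno.ETIMEDOUT))  # -> 'Connection timed out'
```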
- 03:54 PM Bug #44582: LibRadosMisc.ShutdownRace
- /a/sage-2020-03-17_14:06:37-rados:verify-wip-sage-testing-2020-03-16-2107-distro-basic-smithi/4863370
- 04:07 PM Bug #44658 (Resolved): seastar is busting unit tests
- run-make-check.sh does not pass any more in seastar tests. I have tried on my old rex box and one of the new vossi ma...
- 04:04 PM Bug #44453 (Resolved): mon: fix/improve mon sync over small keys
- While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ...
- 03:15 PM Bug #44643 (Can't reproduce): leaked buffer (alloc from MonClient::handle_auth_request)
- ...
- 08:27 AM Bug #37875: osdmaps aren't being cleaned up automatically on healthy cluster
- Hi Greg, are you sure this ticket is a duplicate? In my case, the cluster is healthy with no DOWN OSDs, so it shouldn...
- 01:46 AM Bug #44507 (Fix Under Review): osd/PeeringState.cc: 5582: FAILED ceph_assert(ps->is_acting(osd_wi...
03/16/2020
- 10:40 PM Bug #44022 (Resolved): mimic: Receiving MLogRec in Started/Primary/Peering/GetInfo causes an osd ...
- 10:33 PM Backport #44464 (Resolved): nautilus: mon: fix/improve mon sync over small keys
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/33765
m...
- 10:25 PM Bug #44631: ceph pg dump error code 124
- 124 -> process was killed by SIGTERM (according to https://www.howtogeek.com/423286/how-to-use-the-timeout-command-on...
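A quick illustration of where exit code 124 comes from (assumes GNU coreutils `timeout`, as in the article linked above): `timeout` sends SIGTERM to the child when the limit expires and then itself exits 124.

```python
import subprocess

# Run `sleep 5` under a 1-second limit; GNU `timeout` kills it with
# SIGTERM and reports exit status 124, the code seen by the test harness.
result = subprocess.run(["timeout", "1", "sleep", "5"])
print(result.returncode)  # -> 124
```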
- 10:24 PM Bug #44631 (New): ceph pg dump error code 124
- ...
- 10:17 PM Bug #21592: LibRadosCWriteOps.CmpExt got 0 instead of -4095-1
- ...
- 02:52 PM Bug #44184: Slow / Hanging Ops after pool creation
- Another customer of ours has reported this behaviour. This cluster was, most likely, installed with Hammer and is now...
- 03:58 AM Bug #44062 (Resolved): LibRadosWatchNotify.WatchNotify failure
03/14/2020
- 03:35 PM Bug #44022: mimic: Receiving MLogRec in Started/Primary/Peering/GetInfo causes an osd crash
- https://github.com/ceph/ceph/pull/33594 merged
03/13/2020
- 11:23 PM Bug #44518: osd/osd-backfill-stats.sh TEST_backfill_out2: wait_for_clean timeout
- /a/sage-2020-03-09_01:44:37-rados:standalone-wip-sage2-testing-2020-03-08-1456-distro-basic-smithi/4838500
queue_r...
- 07:45 PM Bug #44518: osd/osd-backfill-stats.sh TEST_backfill_out2: wait_for_clean timeout
- Comparing a passed test with a failed one:
PASSED - note the PG mapping [2,4,3]/[1,0] backfill=[2,3,4]...
- 08:51 PM Bug #44566 (Resolved): ceph tell segv: librados fini vs protocolv2
- 12:26 PM Bug #44566: ceph tell segv: librados fini vs protocolv2
- for the record, all threads in the second instance:...
- 12:25 PM Bug #44566 (Fix Under Review): ceph tell segv: librados fini vs protocolv2
- 12:22 PM Bug #44566: ceph tell segv: librados fini vs protocolv2
- I think this is related to the fix for #44526, https://github.com/ceph/ceph/pull/33825, which skips rados shutdown.
...
- 12:20 PM Bug #44566: ceph tell segv: librados fini vs protocolv2
- again...
- 04:11 PM Bug #44184: Slow / Hanging Ops after pool creation
- Could you please submit output for the command...
- 12:57 PM Bug #44184: Slow / Hanging Ops after pool creation
- Jan Fajerski wrote:
> Can confirm seeing issues with osd map pruning ("oldest_map": 41985 vs "newest_map": 83376) an...
- 12:53 PM Bug #44184: Slow / Hanging Ops after pool creation
- Can confirm seeing issues with osd map pruning ("oldest_map": 41985 vs "newest_map": 83376) and large osd_map_cache_m...
- 12:12 PM Bug #44595 (New): cache tiering: Error: oid 48 copy_from 493 returned error code -2
- ...
03/12/2020
- 08:59 PM Bug #44586 (New): Deleting a pool w/ in-flight ops might crash client osdc
- The rbd-mirror test cases conclude the test by deleting the pools just to ensure the daemons survive. It appears that...
- 05:32 PM Bug #44243: memstore make check test fails
- here is a similar failure from ceph_test_objectstore:...
- 03:14 PM Bug #44582 (Resolved): LibRadosMisc.ShutdownRace
- ...
- 02:32 PM Bug #44352: pool listings are slow after deleting objects
- around 20s were taken to just list contents of the pool which is what happened in the debug logs, what time is taken ...
- 02:16 PM Bug #44184: Slow / Hanging Ops after pool creation
- So this message came along on the users mailinglist:...
03/11/2020
- 05:27 PM Bug #44566 (Resolved): ceph tell segv: librados fini vs protocolv2
- ...
- 02:35 PM Bug #44184: Slow / Hanging Ops after pool creation
- Forgot to mention: the cluster has been running since Jewel 10.2.4. I think upgrading from Jewel or older seems to be...
- 02:31 PM Bug #44184: Slow / Hanging Ops after pool creation
- Oh, running mostly 14.2.4 and some OSDs on 14.2.5 on Ubuntu 16.04. Small cluster with only 127 OSDs.
- 02:29 PM Bug #44184: Slow / Hanging Ops after pool creation
- I've encountered another cluster in the wild that hit this bug. It seems to be triggered somewhat reliably a few hour...
- 02:38 AM Bug #44062 (In Progress): LibRadosWatchNotify.WatchNotify failure
- 02:02 AM Bug #44062: LibRadosWatchNotify.WatchNotify failure
- i think https://github.com/ceph/ceph/pull/33871 may help...
- 12:43 AM Bug #44062: LibRadosWatchNotify.WatchNotify failure
- /a/sage-2020-03-10_16:51:17-rados-wip-sage3-testing-2020-03-10-1037-distro-basic-smithi/4844006
I think this is fa...
03/10/2020
- 10:20 PM Bug #44373 (Resolved): objecter: invalid read
- 10:03 PM Bug #38357: ClsLock.TestExclusiveEphemeralStealEphemeral failed
- ...
- 10:01 PM Bug #44518: osd/osd-backfill-stats.sh TEST_backfill_out2: wait_for_clean timeout
- slightly different version of this:...
- 10:00 PM Bug #44062: LibRadosWatchNotify.WatchNotify failure
- /a/sage-2020-03-10_16:51:17-rados-wip-sage3-testing-2020-03-10-1037-distro-basic-smithi/4844127
- 06:36 PM Bug #43861: ceph_test_rados_watch_notify hang
- ...
- 04:48 PM Backport #44464 (In Progress): nautilus: mon: fix/improve mon sync over small keys
- 04:47 PM Backport #44464 (Resolved): nautilus: mon: fix/improve mon sync over small keys
- 03:15 PM Bug #44184: Slow / Hanging Ops after pool creation
- Hi everyone, we were able to collect some debug logs that seems to exhibit this case. the log is fairly large, you ca...
- 01:23 PM Bug #44420: cephadm cluster: "ceph ping mon.*" works fine, but "ceph ping mon.<id>" is broken
- might be a cephadm issue.
- 06:30 AM Bug #44536 (New): Segmentation fault when it sets ENABLE_COVERAGE:BOOL=ON
- 1. It sets option(ENABLE_COVERAGE "Coverage is enabled" ON) in CMakeLists.txt
2. root@dev:/data/liugangbiao/zy/cep...
- 12:12 AM Bug #44362: osd: uninitialized memory in sendmsg
- @Yehuda: what's the status of this ticket?
Are you able to replicate the issue locally or it happens solely at sepia?
03/09/2020
- 11:57 PM Bug #44532 (Resolved): nautilus: FAILED ceph_assert(head.version == 0 || e.version.version > head...
- Run: http://pulpito.ceph.com/yuriw-2020-03-07_18:26:25-rados-wip-yuri8-testing-2020-03-06-2005-nautilus-distro-basic-...
- 08:28 PM Bug #43865 (Resolved): osd-scrub-test.sh fails date check
- 07:18 PM Bug #44427: osd: stuck during shutdown
- /a/sage-2020-03-09_14:07:51-rados-wip-sage4-testing-2020-03-09-0634-distro-basic-smithi/4841228
- 07:06 PM Bug #39039 (Need More Info): mon connection reset, command not resent
- can't reproduce :(
- 07:06 PM Bug #44517: osd/osd-backfill-space.sh TEST_backfill_multi_partial: pgs didn't go active+clean
- see http://pulpito.ceph.com/sage-2020-03-09_01:44:37-rados:standalone-wip-sage2-testing-2020-03-08-1456-distro-basic-...
- 12:12 PM Bug #44517 (New): osd/osd-backfill-space.sh TEST_backfill_multi_partial: pgs didn't go active+clean
- ...
- 07:06 PM Bug #44518: osd/osd-backfill-stats.sh TEST_backfill_out2: wait_for_clean timeout
- see http://pulpito.ceph.com/sage-2020-03-09_01:44:37-rados:standalone-wip-sage2-testing-2020-03-08-1456-distro-basic-...
- 12:14 PM Bug #44518 (Resolved): osd/osd-backfill-stats.sh TEST_backfill_out2: wait_for_clean timeout
- ...
- 06:42 AM Bug #43382: medium io/system load causes quorum failure
- couldn't reproduce, close
- 06:41 AM Bug #43185: ceph -s not showing client activity
- fixed with 14.2.8
- 03:06 AM Bug #44439 (Resolved): osd/osd-scrub-repair.sh fails: scrub/osd-scrub-repair.sh:698: TEST_repair_...
03/08/2020
- 11:49 PM Bug #44510 (New): osd/osd-recovery-space.sh TEST_recovery_test_simple failure
- ...
- 07:53 PM Bug #43865 (Fix Under Review): osd-scrub-test.sh fails date check
- 04:30 PM Bug #39039: mon connection reset, command not resent
- trying to reproduce here:
http://pulpito.ceph.com/sage-39039/
http://pulpito.ceph.com/sage-39039-lessdebug/
- 04:28 PM Bug #44229 (Can't reproduce): monclient: _check_auth_rotating possible clock skew, rotating keys ...
- 02:13 AM Bug #44507 (Resolved): osd/PeeringState.cc: 5582: FAILED ceph_assert(ps->is_acting(osd_with_shard...
- ...
03/07/2020
- 07:54 PM Bug #39039 (In Progress): mon connection reset, command not resent
- /a/sage-2020-03-07_14:00:00-rados-master-distro-basic-smithi/4834734
- 02:21 PM Bug #44454 (Resolved): expected valgrind issues and found none
- 01:57 PM Bug #44454 (In Progress): expected valgrind issues and found none
- 02:13 PM Bug #44362: osd: uninitialized memory in sendmsg
- hmm, seeing this now on master, after the existing whitelist was updated to the new symbols in 31a7a461382a3a979c12e1...
- 04:44 AM Bug #44362: osd: uninitialized memory in sendmsg
- @sage I think we can close it. It seems that my research tracks @rzarzynski's, so I'll take his original conclusions.
- 12:58 PM Bug #43861: ceph_test_rados_watch_notify hang
- no output at all, like comment 2 above:
/a/sage-2020-03-06_17:29:42-rados-wip-sage4-testing-2020-03-05-1645-distro-b...
- 08:09 AM Bug #43185: ceph -s not showing client activity
- new dump...after disabling almost all mgr modules
03/06/2020
- 11:39 PM Bug #43865: osd-scrub-test.sh fails date check
- reproducing this here: http://pulpito.ceph.com/sage-2020-03-06_22:05:09-rados:standalone-wip-sage4-testing-2020-03-05...
- 10:52 PM Bug #43862 (Can't reproduce): mkfs fsck found fatal error: (2) No such file or directory during c...
- 06:15 PM Feature #43377: Make Zstandard compression level a configurable option
- *PR*: https://github.com/ceph/ceph/pull/33790
- 05:39 PM Bug #44362: osd: uninitialized memory in sendmsg
- Merged https://github.com/ceph/ceph/pull/33757 ... should we keep this open or close it?
- 03:01 PM Bug #43882 (Need More Info): osd to mon connection lost, osd stuck down
- i thought i reproduced this, but it was a bug in another PR i was testing.
- 02:40 PM Bug #43882 (In Progress): osd to mon connection lost, osd stuck down
- 12:28 PM Bug #43150 (Resolved): osd-scrub-snaps.sh fails
- While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ...
- 12:25 PM Backport #44070 (Resolved): luminous: Add builtin functionality in ceph-kvstore-tool to repair co...
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/33195
m...
- 12:05 PM Backport #43852 (Resolved): nautilus: osd-scrub-snaps.sh fails
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/33274
m...
- 10:37 AM Backport #44490 (Resolved): nautilus: lz4 compressor corrupts data when buffers are unaligned
- https://github.com/ceph/ceph/pull/35004
- 10:36 AM Backport #44489 (Rejected): mimic: lz4 compressor corrupts data when buffers are unaligned
- https://github.com/ceph/ceph/pull/35054
- 10:33 AM Backport #44486 (Resolved): nautilus: Nautilus: Random mon crashes in failed assertion at ceph::t...
- https://github.com/ceph/ceph/pull/34542
- 10:30 AM Backport #44468 (Resolved): nautilus: mon: Get session_map_lock before remove_session
- https://github.com/ceph/ceph/pull/34677
- 10:30 AM Backport #44467 (Rejected): mimic: mon: Get session_map_lock before remove_session
- 10:30 AM Backport #44464 (Resolved): nautilus: mon: fix/improve mon sync over small keys
- https://github.com/ceph/ceph/pull/33765
- 07:03 AM Bug #44454 (Resolved): expected valgrind issues and found none
- 03:31 AM Bug #44454 (In Progress): expected valgrind issues and found none
- running with suite-repo pointing to the commit just *before* the py3 task merge faf701d33aeb6e1657c969a41223b37a6972b...
- 04:34 AM Bug #44439 (Fix Under Review): osd/osd-scrub-repair.sh fails: scrub/osd-scrub-repair.sh:698: TEST...
- 02:47 AM Bug #44439: osd/osd-scrub-repair.sh fails: scrub/osd-scrub-repair.sh:698: TEST_repair_stats_ec: ...
- This did reproduce after multiple runs. I added a flush_pg_stats and ran it many times without seeing the failure.
- 03:37 AM Bug #44373 (Fix Under Review): objecter: invalid read
- Fix at https://github.com/ceph/ceph/pull/33771
03/05/2020
- 10:13 PM Bug #44454 (Resolved): expected valgrind issues and found none
- http://pulpito.ceph.com/sage-2020-03-05_19:46:30-rados:valgrind-leaks-wip-sage4-testing-2020-03-05-0754-distro-basic-...
- 09:06 PM Bug #44439: osd/osd-scrub-repair.sh fails: scrub/osd-scrub-repair.sh:698: TEST_repair_stats_ec: ...
- hmm, does not reproduce locally for me.
- 01:20 PM Bug #44439 (Resolved): osd/osd-scrub-repair.sh fails: scrub/osd-scrub-repair.sh:698: TEST_repair_...
- ...
- 08:19 PM Bug #44453: mon: fix/improve mon sync over small keys
- Nautilus backport: https://github.com/ceph/ceph/pull/33765
- 07:18 PM Bug #44453 (Resolved): mon: fix/improve mon sync over small keys
- Background: [ceph-users] Can't add a ceph-mon to existing large cluster
- 07:32 PM Bug #42830: problem returning mon to cluster
- Workaround in our case is: `ceph config set mon mon_sync_max_payload_size 4096`
We have 5 mons again!
- 02:35 PM Bug #42830: problem returning mon to cluster
- I also posted this on the mailinglist, but let me post it here as well:...
- 04:45 PM Backport #44070: luminous: Add builtin functionality in ceph-kvstore-tool to repair corrupted key...
> https://github.com/ceph/ceph/pull/33195
merged
- 02:02 PM Bug #44385 (Resolved): ClsHello.WriteReturnData failure
- 01:19 PM Bug #41923 (Can't reproduce): 3 different ceph-osd asserts caused by enabling auto-scaler
03/04/2020
- 10:25 PM Bug #44311 (New): crash in Objecter and CRUSH map lookup
- 01:42 PM Bug #44311: crash in Objecter and CRUSH map lookup
- Scratch that. If you replace qa/workunits/rbd/read-flags.sh with this script https://gist.github.com/MahatiC/a4bf4310...
- 01:35 PM Bug #44311: crash in Objecter and CRUSH map lookup
- Neha Ojha wrote:
> Is this something that started appearing recently? Do you have a commit or version that works for...
- 10:11 PM Bug #44400: Marking OSD out causes primary-affinity 0 to be ignored when up_set has no common OSD...
- This is worth investigating, currently nothing in the choose_acting() function looks at primary-affinity.
- 10:07 PM Bug #44348 (Resolved): thrasher can trigger osd shutdown
- 09:40 PM Bug #44427 (New): osd: stuck during shutdown
- ...
- 08:55 PM Bug #44362: osd: uninitialized memory in sendmsg
- The hole represented by @filler@ is supposed to carry two things:
* zero-byte long ciphertext's fragment acquired fr...
- 07:58 PM Bug #37656 (Triaged): FileStore::_do_transaction() crashed with error 17 (merge collection vs osd...
- 07:56 PM Bug #37656: FileStore::_do_transaction() crashed with error 17 (merge collection vs osd restart)
- the merge happens right before we shut down:...
- 07:40 PM Bug #37656: FileStore::_do_transaction() crashed with error 17 (merge collection vs osd restart)
- ...
- 02:30 PM Bug #44420 (Fix Under Review): cephadm cluster: "ceph ping mon.*" works fine, but "ceph ping mon....
- $SUBJ says it all, almost - The error is:...
- 12:52 PM Bug #43365 (Pending Backport): Nautilus: Random mon crashes in failed assertion at ceph::time_det...
- 05:44 AM Bug #43365: Nautilus: Random mon crashes in failed assertion at ceph::time_detail::signedspan
- ...
- 12:43 PM Bug #44407 (Pending Backport): mon: Get session_map_lock before remove_session
- 10:33 AM Bug #44407 (Fix Under Review): mon: Get session_map_lock before remove_session
- 06:08 AM Bug #44407 (Resolved): mon: Get session_map_lock before remove_session
- We should protect session_map with session_map_lock.
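The idea of the fix, sketched in Python rather than the actual C++ Monitor code (the names mirror the ticket; the helpers are illustrative only): every access to the shared session map must hold session_map_lock first.

```python
import threading

session_map = {}                     # stand-in for the monitor's session map
session_map_lock = threading.Lock()  # must be held for every access

def add_session(session_id, info):
    with session_map_lock:
        session_map[session_id] = info

def remove_session(session_id):
    # Taking the lock before erasing is the point of the fix: an unlocked
    # remove can race with concurrent readers/writers of the map.
    with session_map_lock:
        session_map.pop(session_id, None)
```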
- 10:59 AM Backport #44413 (In Progress): nautilus: FTBFS on s390x in openSUSE Build Service due to presence...
- 10:58 AM Backport #44413 (Resolved): nautilus: FTBFS on s390x in openSUSE Build Service due to presence of...
- https://github.com/ceph/ceph/pull/33716
- 04:45 AM Bug #39525 (Pending Backport): lz4 compressor corrupts data when buffers are unaligned
- 12:16 AM Bug #44385 (Fix Under Review): ClsHello.WriteReturnData failure
- reproduced locally by making the test loop and setting ms_inject_socket_failures=500 on the osd. confirmed this fixe...
- 12:00 AM Bug #44385 (In Progress): ClsHello.WriteReturnData failure
03/03/2020
- 10:13 PM Bug #44362 (In Progress): osd: uninitialized memory in sendmsg
- 03:15 AM Bug #44362: osd: uninitialized memory in sendmsg
- It seems to me that the specific commit just exposed an existing issue that for some reason didn't show up before (lik...
- 07:23 PM Bug #44400 (Won't Fix): Marking OSD out causes primary-affinity 0 to be ignored when up_set has n...
- Process:
Set primary-affinity 0 on osd.0
Watch 'ceph osd ls-by-primary osd.0' until it has 0 PGs listed.
Mark os...
- 07:12 PM Bug #43150: osd-scrub-snaps.sh fails
- https://github.com/ceph/ceph/pull/33274 merged
- 04:09 PM Bug #43365 (Fix Under Review): Nautilus: Random mon crashes in failed assertion at ceph::time_det...
- 12:58 PM Bug #43365: Nautilus: Random mon crashes in failed assertion at ceph::time_detail::signedspan
- Hi,
same behaviour for us: one of the 3 mons crashes randomly, nearly once per day.
We are using Ceph 14.2.6 PVE ...
- 04:09 PM Bug #44311: crash in Objecter and CRUSH map lookup
- Is this something that started appearing recently? Do you have a commit or version that works for this same command? ...
- 02:48 PM Bug #44184: Slow / Hanging Ops after pool creation
- We've got a similar case with plenty of slow op indications, many of them osd_op_create ones.
Which eventually g...
- 12:34 AM Bug #44388 (New): osd: valgrind: Invalid read of size 8
- ...
03/02/2020
- 10:29 PM Bug #44362: osd: uninitialized memory in sendmsg
- the takeaway from http://pulpito.ceph.com/sage-2020-03-02_17:19:00-rados:verify-master-distro-basic-smithi/ is that t...
- 09:08 PM Bug #44362: osd: uninitialized memory in sendmsg
- The regression is between these commits: d27f512d1731988cf7f369559f2fc324f1592047..7b0e18c09eb6060ee23f00c06dac4203a2...
- 08:39 PM Bug #44385 (Resolved): ClsHello.WriteReturnData failure
- ...
- 06:57 PM Bug #44311: crash in Objecter and CRUSH map lookup
- To give more context, this issue is blocking progress on rbd op threads config change -> https://github.com/ceph/ceph...
- 06:04 PM Bug #44358 (Resolved): messenger addr nonces aren't unique with cephadm
- 02:06 PM Bug #44373 (Resolved): objecter: invalid read
- ...
- 12:36 PM Backport #44370 (Resolved): nautilus: msg/async: the event center is blocked by rdma construct co...
- https://github.com/ceph/ceph/pull/34780
- 12:36 PM Backport #44369 (Rejected): mimic: msg/async: the event center is blocked by rdma construct conec...
- 12:36 PM Backport #44368 (Rejected): mimic: Rados should use the '-o outfile' convention
03/01/2020
- 11:00 PM Bug #44362 (Can't reproduce): osd: uninitialized memory in sendmsg
- ...
- 10:55 PM Bug #44358 (Fix Under Review): messenger addr nonces aren't unique with cephadm
- 07:47 AM Bug #42452 (Pending Backport): msg/async: the event center is blocked by rdma construct conection...
- 04:18 AM Backport #44360 (In Progress): nautilus: Rados should use the '-o outfile' convention
- 04:18 AM Backport #44360: nautilus: Rados should use the '-o outfile' convention
- https://github.com/ceph/ceph/pull/33641
- 04:17 AM Backport #44360 (Resolved): nautilus: Rados should use the '-o outfile' convention
- https://github.com/ceph/ceph/pull/33641
- 04:08 AM Bug #42477 (Pending Backport): Rados should use the '-o outfile' convention
- we have to backport this change, otherwise we have ...
02/29/2020
- 06:24 AM Bug #43185: ceph -s not showing client activity
- ...
- 06:19 AM Bug #43185: ceph -s not showing client activity
- ...
- 12:18 AM Bug #44314: osd-backfill-stats.sh failing intermittently in TEST_backfill_sizeup_out() (degraded ...
The kick_recovery_wq didn't get backfill restarted on the failed run. Or a recovery attempt (periodic?) was someho...
02/28/2020
- 11:01 PM Bug #44314: osd-backfill-stats.sh failing intermittently in TEST_backfill_sizeup_out() (degraded ...
-
The unset of nobackfill happened after an attempt to start backfill was initiated and it deferred due to the fl...
- 09:37 PM Bug #43126 (Resolved): OSD_SLOW_PING_TIME_BACK nits
- 08:30 PM Bug #44358 (Resolved): messenger addr nonces aren't unique with cephadm
- we use the pid for the nonce all over the place, but with cephadm the pid of daemons is always 1.
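A sketch of the collision and one common remedy (hypothetical helper, not the actual messenger code): when every containerized daemon runs as pid 1, a pid-only nonce is identical across daemons, so extra entropy is needed to keep them distinct.

```python
import os
import random

def make_nonce():
    # pid alone is not unique when each daemon is pid 1 in its container;
    # mixing in random bits keeps nonces distinct (illustrative only).
    return (random.getrandbits(32) << 16) | (os.getpid() & 0xFFFF)

# Two "daemons" with the same pid still get different nonces.
print(make_nonce() != make_nonce())
```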
- 03:45 PM Bug #38069: upgrade:jewel-x-luminous with short_pg_log.yaml fails with assert(s <= can_rollback_to)
- For future reference.
If I understand right: it seems that this happens during recovery when pg gets trim command a... - 12:56 PM Bug #44352 (New): pool listings are slow after deleting objects
- I'm seeing a weird problem on a system where the following was done:
* multi-site setup with two zones
* primary ...
- 12:14 PM Bug #41016 (Resolved): Improve upmap change reporting in logs
- While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ...
- 12:14 PM Bug #41317 (Resolved): PeeringState::GoClean will call purge_strays unconditionally
- While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ...
- 12:12 PM Bug #42387 (Resolved): ceph_test_admin_socket_output fails in rados qa suite
- While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ...
- 12:12 PM Bug #42501 (Resolved): format error: ceph osd stat --format=json
- While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ...
- 12:11 PM Backport #43992 (Need More Info): nautilus: objecter doesn't send osd_op
- first attempted backport - https://github.com/ceph/ceph/pull/33144 - was closed
marking non-trivial
- 12:11 PM Bug #43308 (Resolved): negative num_objects can set PG_STATE_DEGRADED
- While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ...
- 12:11 PM Backport #43991 (Need More Info): mimic: objecter doesn't send osd_op
- first attempted backport - https://github.com/ceph/ceph/pull/33143 - was closed
marking non-trivial
- 09:17 AM Bug #44297 (Resolved): mon/Monitor.cc: 3924: FAILED ceph_assert(!"send_message on anonymous conne...
- 09:07 AM Bug #44062: LibRadosWatchNotify.WatchNotify failure
- /a/sage-2020-02-27_05:12:04-rados-wip-sage2-testing-2020-02-26-1925-distro-basic-smithi/4806157
- 09:06 AM Bug #44348 (Fix Under Review): thrasher can trigger osd shutdown
- 08:56 AM Bug #44348 (Resolved): thrasher can trigger osd shutdown
- ...
- 03:42 AM Bug #44296 (Resolved): qa/standalone/mgr/balancer.sh fails due to test error and not waiting for ...
- 02:39 AM Bug #44296 (Fix Under Review): qa/standalone/mgr/balancer.sh fails due to test error and not wait...
- 02:32 AM Bug #44022 (Fix Under Review): mimic: Receiving MLogRec in Started/Primary/Peering/GetInfo causes...
- 02:10 AM Bug #44022 (In Progress): mimic: Receiving MLogRec in Started/Primary/Peering/GetInfo causes an o...
- 02:00 AM Bug #44022: mimic: Receiving MLogRec in Started/Primary/Peering/GetInfo causes an osd crash
- Ah, this is why: 168e20ab8b8da3a5aed41b73f9627d10971be67b...
02/27/2020
- 11:30 PM Bug #44022 (Fix Under Review): mimic: Receiving MLogRec in Started/Primary/Peering/GetInfo causes...
- In any case, https://github.com/ceph/ceph/pull/33590 will just prevent it from crashing.
- 11:22 PM Bug #44022: mimic: Receiving MLogRec in Started/Primary/Peering/GetInfo causes an osd crash
- The part that I don't understand is when osd.6 responded, the epoch_sent/epoch_requested(4115/4100) seem correct
<...
- 03:45 AM Bug #44022: mimic: Receiving MLogRec in Started/Primary/Peering/GetInfo causes an osd crash
- On osd.8(mimic)
This is when we request the log from osd.6...
- 02:29 AM Bug #44022: mimic: Receiving MLogRec in Started/Primary/Peering/GetInfo causes an osd crash
- ...
- 09:50 PM Support #22749 (Closed): dmClock OP classification
- 09:11 PM Bug #42328 (Resolved): osd/PrimaryLogPG.cc: 3962: ceph_abort_msg("out of order op")
- 06:22 PM Backport #43472 (Resolved): mimic: negative num_objects can set PG_STATE_DEGRADED
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/33331
m...
- 06:21 PM Backport #43320 (Resolved): mimic: PeeringState::GoClean will call purge_strays unconditionally
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/33329
m...
- 06:20 PM Backport #42998 (Resolved): mimic: acting_recovery_backfill won't catch all up peers
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/33324
m...
- 06:20 PM Backport #42852 (Resolved): mimic: format error: ceph osd stat --format=json
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/33322
m...
- 06:18 PM Backport #43881 (Resolved): mimic: mon: segv in MonOpRequest::~MonOpRequest OpHistory::cleanup
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/33154
m...
- 06:18 PM Backport #43987 (Resolved): mimic: osd: Allow 64-char hostname to be added as the "host" in CRUSH
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/33145
m...
- 06:17 PM Backport #43652 (Resolved): mimic: Improve upmap change reporting in logs
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/32717
m...
- 06:17 PM Backport #40890 (Resolved): mimic: Pool settings aren't populated to OSD after restart.
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/32125
m...
- 06:16 PM Backport #42879 (Resolved): mimic: ceph_test_admin_socket_output fails in rados qa suite
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/33323
m...
- 06:16 PM Backport #43630 (Resolved): mimic: segv in collect_sys_info
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/32902
m...
- 04:16 PM Bug #39525 (Fix Under Review): lz4 compressor corrupts data when buffers are unaligned
- Thanks to Dan we have a reproducer! I cleaned it up a bit, rebased on master, added a workaround to the LZ4 plugin, ...
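The workaround pattern mentioned here — rebuilding a possibly unaligned, segmented buffer into one fresh contiguous buffer before handing it to the compressor — can be sketched as follows. This is an illustration only, not the Ceph plugin code: Python's `zlib` stands in for LZ4, and a `memoryview` sliced at an odd offset stands in for an unaligned bufferlist segment.

```python
import zlib


def compress_rebuilt(view: memoryview) -> bytes:
    """Copy a possibly unaligned, non-contiguous view into a fresh
    contiguous buffer before compressing -- the general shape of a
    'rebuild before compress' workaround (zlib stands in for LZ4)."""
    data = bytes(view)  # fresh, contiguous, aligned copy
    return zlib.compress(data)


# A view starting at an odd offset into a larger buffer:
big = b"\x00" * 4096 + b"payload " * 512
unaligned = memoryview(big)[1:]
out = compress_rebuilt(unaligned)
assert zlib.decompress(out) == bytes(unaligned)  # round-trips intact
```

The copy costs one extra pass over the data, which is why it is a workaround rather than a fix for the underlying compressor bug.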
- 03:44 PM Bug #39525: lz4 compressor corrupts data when buffers are unaligned
- No, not positive. Very early on we did play around with compression also for the metadata, but in the end decided lat...
- 03:26 PM Bug #39525: lz4 compressor corrupts data when buffers are unaligned
- Erik Lindahl wrote:
> Hi,
>
> Oops; Sorry Dan, but I just realised I misled you. While we do have aggressive LZ4 ...
- 10:57 AM Bug #39525: lz4 compressor corrupts data when buffers are unaligned
- Hi,
Oops; Sorry Dan, but I just realised I misled you. While we do have aggressive LZ4 enabled by *default* (in pa...
- 10:34 AM Bug #39525: lz4 compressor corrupts data when buffers are unaligned
- I got confirmation from Troy and Erik -- both are using lz4 compression like us.
I'm trying to reproduce using uni...
- 12:55 PM Backport #44324 (Resolved): nautilus: Receiving RemoteBackfillReserved in WaitLocalBackfillReserv...
- https://github.com/ceph/ceph/pull/34512
02/26/2020
- 11:57 PM Backport #43472: mimic: negative num_objects can set PG_STATE_DEGRADED
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/33331
merged
- 11:56 PM Backport #43320: mimic: PeeringState::GoClean will call purge_strays unconditionally
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/33329
merged
- 11:55 PM Backport #42998: mimic: acting_recovery_backfill won't catch all up peers
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/33324
merged
- 11:55 PM Backport #42852: mimic: format error: ceph osd stat --format=json
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/33322
merged
- 11:54 PM Backport #43881: mimic: mon: segv in MonOpRequest::~MonOpRequest OpHistory::cleanup
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/33154
merged
- 11:52 PM Backport #43881: mimic: mon: segv in MonOpRequest::~MonOpRequest OpHistory::cleanup
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/33154
merged
- 11:51 PM Backport #43987: mimic: osd: Allow 64-char hostname to be added as the "host" in CRUSH
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/33145
merged
- 11:51 PM Backport #43652: mimic: Improve upmap change reporting in logs
- David Zafman wrote:
> https://github.com/ceph/ceph/pull/32717
merged
- 11:49 PM Backport #40890: mimic: Pool settings aren't populated to OSD after restart.
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/32125
merged
- 11:47 PM Backport #42879: mimic: ceph_test_admin_socket_output fails in rados qa suite
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/33323
merged
- 11:44 PM Backport #43630: mimic: segv in collect_sys_info
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/32902
merged
- 11:08 PM Bug #44311: crash in Objecter and CRUSH map lookup
- Mahati Chamarthy wrote:
> Neha Ojha wrote:
> > Which version is this on?
>
Current master. To reproduce set rbd_...
- 10:13 PM Bug #44311: crash in Objecter and CRUSH map lookup
- Neha Ojha wrote:
> Which version is this on?
Current master
- 10:09 PM Bug #44311 (Need More Info): crash in Objecter and CRUSH map lookup
- Which version is this on?
- 05:45 PM Bug #44311 (Resolved): crash in Objecter and CRUSH map lookup
- When concurrent reads are issued with the below rbd command, it results in failure due to a crash in Objecter and CRUSH...
- 09:54 PM Bug #44296 (In Progress): qa/standalone/mgr/balancer.sh fails due to test error and not waiting f...
- 09:50 PM Feature #44107: mon: produce stable election results when netsplits and other errors happen
- Marking anything we need for octopus as "Urgent".
- 08:51 PM Bug #44062: LibRadosWatchNotify.WatchNotify failure
- ...
- 08:48 PM Bug #44314 (Resolved): osd-backfill-stats.sh failing intermittently in TEST_backfill_sizeup_out()...
- ...
- 07:47 PM Bug #43914 (Fix Under Review): nautilus: ceph tell command times out
- 06:48 PM Bug #43914: nautilus: ceph tell command times out
- okay yeah, it's because the command wq uses osd_lock...
- 06:41 PM Bug #43914: nautilus: ceph tell command times out
- so, this was fixed in nautilus, in the sense that https://github.com/ceph/ceph/pull/27696 went into nautilus.
- 06:37 PM Bug #43914: nautilus: ceph tell command times out
- The thread (or lock?) is busy with...
- 05:13 PM Bug #43914: nautilus: ceph tell command times out
- This run has more relevant information: /a/nojha-2020-02-26_03:20:34-upgrade:mimic-x:stress-split-nautilus-distro-bas...
- 07:33 PM Bug #42328: osd/PrimaryLogPG.cc: 3962: ceph_abort_msg("out of order op")
- follow-up fix: https://github.com/ceph/ceph/pull/33559 (typo in original commit)
- 05:25 PM Bug #41183: pg autoscale on EC pools
- Looks like a fix is going in: https://github.com/ceph/ceph/pull/33170
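For context on the sizing an autoscaler has to get right for EC pools: an erasure-coded k+m pool writes (k+m)/k raw bytes per logical byte, versus n for an n-replica pool. A quick check of that standard ratio (an illustration of the arithmetic, not the content of the linked fix):

```python
def raw_space_multiplier(k: int, m: int) -> float:
    """Raw bytes written per logical byte for an EC k+m pool."""
    return (k + m) / k


# An EC 5+2 pool (as in the report in this thread) writes 1.4x raw per
# logical byte, versus 3.0x for a size-3 replicated pool.
assert raw_space_multiplier(5, 2) == 1.4
```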
- 04:54 PM Bug #41183: pg autoscale on EC pools
- Seem to have the same issue here.
158 OSDs with 1 main pool, an EC 5+2 pool with a 2048 pg_num, but the autoscaler...
- 02:28 PM Bug #43365: Nautilus: Random mon crashes in failed assertion at ceph::time_detail::signedspan
- We're seeing this a couple times a day on debian 10.1, using croit's repo:
kernel 4.19.67-2+deb10u1
ceph version 14...
- 10:54 AM Cleanup #44309 (New): auth: remove deprecated 'auid' field from pool metadata
- As per https://github.com/ceph/ceph/pull/23540#issuecomment-413589557, 'auid' field was deprecated but never removed ...
- 12:09 AM Bug #44297 (Fix Under Review): mon/Monitor.cc: 3924: FAILED ceph_assert(!"send_message on anonymo...
- 12:02 AM Bug #44297: mon/Monitor.cc: 3924: FAILED ceph_assert(!"send_message on anonymous connection")
- The command is passed from a nautilus monitor:...
02/25/2020
- 11:51 PM Bug #44275 (Resolved): NameError: name 'retval' is not defined
- 11:50 PM Bug #44248 (Pending Backport): Receiving RemoteBackfillReserved in WaitLocalBackfillReserved can ...
- 03:16 AM Bug #44248 (Fix Under Review): Receiving RemoteBackfillReserved in WaitLocalBackfillReserved can ...
- 02:37 AM Bug #44248: Receiving RemoteBackfillReserved in WaitLocalBackfillReserved can cause the osd to crash
- The problem is that though osd.1 sent a RELEASE to osd.8, we still ended up de-queueing "4184 RemoteBackfillReserved"...
- 12:49 AM Bug #44248: Receiving RemoteBackfillReserved in WaitLocalBackfillReserved can cause the osd to crash
- This is when 4184 RemoteBackfillReserved was enqueued...
- 11:43 PM Bug #44297 (Resolved): mon/Monitor.cc: 3924: FAILED ceph_assert(!"send_message on anonymous conne...
- on nautilus->octopus/master upgrade...
- 11:39 PM Bug #44062: LibRadosWatchNotify.WatchNotify failure
- ...
- 11:32 PM Bug #44296 (Resolved): qa/standalone/mgr/balancer.sh fails due to test error and not waiting for ...
http://pulpito.ceph.com/dzafman-2020-02-08_20:24:49-rados-wip-zafman-testing-distro-basic-smithi/4746333
With 2 ...
- 09:36 PM Bug #43914: nautilus: ceph tell command times out
- First observation from teuthology.log for /a/nojha-2020-02-21_20:34:10-upgrade:mimic-x:stress-split-nautilus-distro-b...
- 06:34 PM Bug #38219: rebuild-mondb hangs
- Seen in nautilus: /a/yuriw-2020-02-15_16:49:25-rados-nautilus-distro-basic-smithi/4767419/
- 04:40 PM Backport #43650: nautilus: Improve upmap change reporting in logs
- 250a778fe8bd6eadf16fa1988403e0410c528543 will be in v14.2.8
- 03:46 PM Backport #44289 (Resolved): nautilus: mon: update + monmap update triggers spawn loop
- https://github.com/ceph/ceph/pull/34500
- 02:28 PM Backport #44206 (In Progress): nautilus: osd segv in ceph::buffer::v14_2_0::ptr::release (PGTempM...
- 02:23 PM Bug #44286 (New): Cache tiering shows unfound objects after OSD reboots
- We've got a cluster with a 3/2 size/min_size replicated cache pool in front of an erasure coded pool used for RBD.
...
- 01:11 AM Bug #39525: lz4 compressor corrupts data when buffers are unaligned
- Might be a stretch, but I just noticed that our bits are flipped nearby the 128k boundary, which is ?coincidentally? ...
02/24/2020
- 10:38 PM Bug #24835 (Can't reproduce): osd daemon spontaneous segfault
- 07:42 PM Bug #44076 (Pending Backport): mon: update + monmap update triggers spawn loop
- 07:36 PM Bug #43048: nautilus: upgrade/mimic-x/stress-split: failed to recover before timeout expired
- https://github.com/ceph/ceph/pull/33470 - fixing the order of msgr2 vs nautilus install is the first step here.
- 05:48 PM Bug #44248: Receiving RemoteBackfillReserved in WaitLocalBackfillReserved can cause the osd to crash
- ...
- 04:25 PM Bug #44275 (Fix Under Review): NameError: name 'retval' is not defined
- 04:17 PM Bug #44275 (Resolved): NameError: name 'retval' is not defined
- ...
- 03:50 PM Bug #42830: problem returning mon to cluster
- I noticed there is very little osdmap caching in the leader mon -- here we see only 1 single osdmap in the mempool.
...
- 05:45 AM Backport #44259 (In Progress): nautilus: Slow Requests/OP's types not getting logged
- 05:03 AM Backport #44259 (Resolved): nautilus: Slow Requests/OP's types not getting logged
- https://github.com/ceph/ceph/pull/33503
- 05:24 AM Bug #39525: lz4 compressor corrupts data when buffers are unaligned
- More ftr: the corruption occurs in the crush part of the osdmap:...
- 05:16 AM Bug #43975 (Pending Backport): Slow Requests/OP's types not getting logged
02/23/2020
- 10:08 PM Bug #43365: Nautilus: Random mon crashes in failed assertion at ceph::time_detail::signedspan
- Likely related....
- 09:29 PM Bug #43365: Nautilus: Random mon crashes in failed assertion at ceph::time_detail::signedspan
- Adding crash signature (cf2864eb1281dffc3340730dc2caae163b4c0170132bcbd3dcbd6147d8f29fa8) for the crash described in ...
- 09:05 PM Bug #43861: ceph_test_rados_watch_notify hang
- ...
- 02:29 PM Bug #41313: PG distribution completely messed up since Nautilus
- ...
- 12:13 PM Bug #39525: lz4 compressor corrupts data when buffers are unaligned
- A bit more about our incident ftr.
The cluster has 1301 osds in total: 752 filestore and 549 bluestore. The filest...
02/22/2020
- 04:47 PM Bug #44248 (Resolved): Receiving RemoteBackfillReserved in WaitLocalBackfillReserved can cause th...
- ...
- 01:25 PM Backport #44206: nautilus: osd segv in ceph::buffer::v14_2_0::ptr::release (PGTempMap::decode)
- Started a backport here https://github.com/ceph/ceph/pull/33483
- 09:49 AM Bug #39525: lz4 compressor corrupts data when buffers are unaligned
- > o->decode(obl); <------ HERE
I have gdb working now on a coredump so can confirm that:...
- 01:00 AM Bug #39525: lz4 compressor corrupts data when buffers are unaligned
- ^^ Is a weird red-herring. The FFFFFFFF is because the osdmap contains the crc32c in the last 4 bytes, so that cancel...
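The cancellation noted here is a general CRC property: the CRC of (data + its own CRC) is a constant "residue" regardless of the data, which is why every map that embeds its checksum in the last 4 bytes hashes to the same value. A minimal sketch with Python's zlib CRC-32 (Ceph uses CRC-32C, so the constant differs from the FFFFFFFF seen in these logs, but the effect is the same):

```python
import struct
import zlib


def crc_over_self(data: bytes) -> int:
    """CRC-32 of data with its own CRC-32 appended (little-endian),
    mirroring a map that carries its checksum in its last 4 bytes."""
    crc = zlib.crc32(data)
    return zlib.crc32(data + struct.pack("<I", crc))


# The result is the same constant residue no matter what the data is:
assert crc_over_self(b"one osdmap") == crc_over_self(b"a totally different map")
```

This is why checksumming a whole checksum-carrying blob can never distinguish one intact map from another; only a corrupted one breaks the constant.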
- 01:23 AM Bug #43914: nautilus: ceph tell command times out
- This is on nautilus: /a/nojha-2020-02-21_20:34:10-upgrade:mimic-x:stress-split-nautilus-distro-basic-smithi/4788575/
...
- 01:14 AM Bug #44062: LibRadosWatchNotify.WatchNotify failure
- /a/sage-2020-02-21_21:08:33-rados-wip-sage3-testing-2020-02-21-1218-distro-basic-smithi/4788714...
02/21/2020
- 10:48 PM Bug #39525: lz4 compressor corrupts data when buffers are unaligned
- Found something. The crc32c for all my *good* maps is FFFFFFFF (and I assure you they are different maps.. gsutil out...
- 10:16 PM Bug #39525: lz4 compressor corrupts data when buffers are unaligned
- 10:16 PM Bug #39525: lz4 compressor corrupts data when buffers are unaligned
- Just to provide the same update I gave to Dan van der Ster over email:
IIRC, we saw this 1-2 times more after the ...
- 09:42 PM Bug #39525: lz4 compressor corrupts data when buffers are unaligned
- This is continuing to happen for us. Log file here.
ceph-post-file: 589aa7aa-7a80-49a2-ba55-376e467c4550
- 10:19 PM Bug #42830: problem returning mon to cluster
- Seeing the same here in 13.2.8 starting a new empty mon. Leader's CPU goes to 100%, until an election is called then ...
- 09:03 PM Bug #44243 (Can't reproduce): memstore make check test fails
- 09:03 PM Bug #44243 (Can't reproduce): memstore make check test fails
- ...
- 01:29 PM Bug #42328 (New): osd/PrimaryLogPG.cc: 3962: ceph_abort_msg("out of order op")
- It looks like this is still occurring even with a branch that included 8182f52149: http://qa-proxy.ceph.com/teutholo...
- 01:21 PM Bug #42347: nautilus assert during osd shutdown: FAILED ceph_assert((sharded_in_flight_list.back(...
- 01:21 PM Bug #42347: nautilus assert during osd shutdown: FAILED ceph_assert((sharded_in_flight_list.back(...
- Bastian Mäuser wrote:
> This is still an issue on 14.2.6 (at least the one shipped with proxmox)
It will appear i...
- 12:49 AM Bug #41240: All of the cluster SSDs aborted at around the same time and will not start.
- FTR this looks identical to https://tracker.ceph.com/issues/39525#note-6
- 12:25 AM Bug #44062: LibRadosWatchNotify.WatchNotify failure
- So the timeout, as previously mentioned, was 10 seconds although osd_default_notify_timeout is 30 seconds by default....
02/20/2020
- 07:02 PM Bug #39525: lz4 compressor corrupts data when buffers are unaligned
- ok, the first crash isn't because we just got bad data.. it's because we just read bad data off of disk. see:...
- 04:09 PM Bug #39525: lz4 compressor corrupts data when buffers are unaligned
- Notes from CERN incident:
- identical corruption, different OSDmaps on different OSDs:...
- 05:40 PM Bug #44229 (New): monclient: _check_auth_rotating possible clock skew, rotating keys expired way ...
- seems to affect cephadm bootstrap tests
first, the error message doesn't make sense, since the bound 2020-02-20T16...
- 12:20 PM Bug #44184: Slow / Hanging Ops after pool creation
- Neha Ojha wrote:
> Hi Wido,
>
> I did come across something like this while investigating https://tracker.ceph.co...
- 12:42 AM Bug #44217 (Can't reproduce): Leaked connection (alloc from AsyncMessenger::add_accept)
- ...
02/19/2020
- 11:42 PM Bug #44076 (Fix Under Review): mon: update + monmap update triggers spawn loop
- 10:45 PM Bug #44157 (Resolved): cli throws bad exception on control-c
- 10:11 PM Bug #44120 (Need More Info): NVMEDevice failed in certain NVMe Disk
- Can you attach logs from the crash? Which version are you using?
- 10:08 PM Bug #44184 (Need More Info): Slow / Hanging Ops after pool creation
- Hi Wido,
I did come across something like this while investigating https://tracker.ceph.com/issues/43048. It was a...
- 07:18 PM Bug #44184: Slow / Hanging Ops after pool creation
- On the Ceph users list there are multiple reports of people experiencing this:
- https://www.spinics.net/lists/cep...
- 04:55 PM Bug #37656 (New): FileStore::_do_transaction() crashed with error 17 (merge collection vs osd res...
- /a/teuthology-2020-02-11_02:30:03-upgrade:mimic-x-nautilus-distro-basic-smithi/4753470/
upgrade:mimic-x/stress-spl...
- 11:00 AM Bug #43151 (Resolved): ok-to-stop incorrect for some ec pgs
- While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ...
- 11:00 AM Bug #43721 (Resolved): qa/standalone/misc/ok-to-stop.sh occasionally fails
- 11:00 AM Bug #43721 (Resolved): qa/standalone/misc/ok-to-stop.sh occasionally fails
- While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ...
- 10:59 AM Backport #44206 (Resolved): nautilus: osd segv in ceph::buffer::v14_2_0::ptr::release (PGTempMap:...
- 10:59 AM Backport #44206 (Resolved): nautilus: osd segv in ceph::buffer::v14_2_0::ptr::release (PGTempMap:...
- https://github.com/ceph/ceph/pull/33530