Activity
From 04/14/2022 to 05/13/2022
05/13/2022
- 10:49 PM Bug #45721: CommandFailedError: Command failed (workunit test rados/test_python.sh) FAIL: test_ra...
- I reproduced the symptoms of this bug locally by incrementing the notify count before an eq check. The extra incremen...
- 09:29 AM Backport #55067: octopus: osd_fast_shutdown_notify_mon option should be true by default
- My test result:...
05/12/2022
- 11:10 PM Backport #55633 (In Progress): octopus: ceph-osd takes all memory before oom on boot
- https://github.com/ceph/ceph/pull/46253
- 06:07 PM Backport #55633 (Rejected): octopus: ceph-osd takes all memory before oom on boot
- 10:56 PM Backport #52077: octopus: api_tier_pp: [ FAILED ] LibRadosTwoPoolsPP.HitSetWrite
- Backport Bot wrote:
> https://github.com/ceph/ceph/pull/45320
merged
- 10:42 PM Backport #55631 (In Progress): pacific: ceph-osd takes all memory before oom on boot
- https://github.com/ceph/ceph/pull/46252
- 06:06 PM Backport #55631 (Resolved): pacific: ceph-osd takes all memory before oom on boot
- 10:39 PM Backport #55632 (In Progress): quincy: ceph-osd takes all memory before oom on boot
- https://github.com/ceph/ceph/pull/46251
- 06:06 PM Backport #55632 (Resolved): quincy: ceph-osd takes all memory before oom on boot
- 06:52 PM Bug #55559: osd-backfill-stats.sh fails in TEST_backfill_ec_prim_out
- Hello Laura! Is there anything that makes you think this isn't a duplicate of #47026?
- 06:48 PM Bug #49689: osd/PeeringState.cc: ceph_abort_msg("past_interval start interval mismatch") start
- If more data is necessary, it might be worth contacting Richard Bateman, who replicated something awfully similar to t...
- 06:29 PM Bug #55582: octopus: api_watch_notify: LibRadosWatchNotify.AioWatchDelete2 fails because `rados_w...
- Yet another in the family of Watch / Notify ENOENT -> ENOTCONN bugs.
- 06:28 PM Bug #44229: monclient: _check_auth_rotating possible clock skew, rotating keys expired way too early
- ...
- 06:25 PM Bug #44229 (New): monclient: _check_auth_rotating possible clock skew, rotating keys expired way ...
- Perhaps this is replicated in:
/home/teuthworker/archive/yuriw-2022-04-29_15:44:49-rados-wip-yuri5-testing-2022-04-28...
- 06:28 PM Bug #49591: no active mgr (MGR_DOWN)" in cluster log
- I can't find @Degraded data redundancy@ in the mgr's log but I can find messages about expired cephx keys:...
- 06:09 PM Bug #52993: upgrade:octopus-x Test: Upgrade test failed due to timeout of the "ceph pg dump" command
- We haven't backported the fix for https://tracker.ceph.com/issues/51815 to Octopus (per Neha's explanation).
- 06:02 PM Bug #47299: Assertion in pg_missing_set: p->second.need <= v || p->second.is_delete()
- Hello! A note from a bug scrub:
1. This issue looks like it is caused by particular data stored in an OSD which
2....
- 09:33 AM Bug #47299: Assertion in pg_missing_set: p->second.need <= v || p->second.is_delete()
- ...
- 05:53 PM Bug #48440 (Need More Info): log [ERR] : scrub mismatch
- We would need to ensure the latest recurrence is about the OSD scrub (we haven't seen too many mon scrubbing issues ...
- 05:45 PM Bug #53729 (Pending Backport): ceph-osd takes all memory before oom on boot
- 02:29 PM Backport #55624 (In Progress): quincy: Unable to format `ceph config dump` command output in yaml...
- 02:26 PM Backport #55624 (Resolved): quincy: Unable to format `ceph config dump` command output in yaml us...
- https://github.com/ceph/ceph/pull/46246
- 02:25 PM Bug #53895 (Pending Backport): Unable to format `ceph config dump` command output in yaml using `...
- 10:12 AM Backport #55067: octopus: osd_fast_shutdown_notify_mon option should be true by default
- I think we should also use the failure_pending queue, like send_failures does, to avoid one osd sending a target osd to the mon multip...
- 09:35 AM Backport #55067: octopus: osd_fast_shutdown_notify_mon option should be true by default
- ...
- 06:08 AM Backport #55067: octopus: osd_fast_shutdown_notify_mon option should be true by default
When a node is actively shut down for operation and maintenance,
the osd/mon/mds process on it will automatically...
- 05:55 AM Backport #55067: octopus: osd_fast_shutdown_notify_mon option should be true by default
- Nitzan Mordechai wrote:
> jianwei zhang wrote:
> > osd_fast_shutdown(true)
> > osd_fast_shutdown_notify_mon(false)...
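The back-and-forth in this thread concerns the OSD fast-shutdown options. As a sketch only: the option names are the ones quoted above, and the commands use the standard `ceph config` interface for inspecting and changing them at runtime:

```shell
# Check the current fast-shutdown behaviour (options quoted in the thread above).
ceph config get osd osd_fast_shutdown
ceph config get osd osd_fast_shutdown_notify_mon

# Have stopping OSDs tell the monitors about their imminent shutdown, so the
# mons can mark them down immediately instead of waiting for failure reports.
ceph config set osd osd_fast_shutdown_notify_mon true
```

Per the thread, `osd_mon_shutdown_timeout` bounds how long a stopping OSD waits for the mon to acknowledge the shutdown before exiting anyway.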
05/11/2022
- 08:59 PM Backport #54568: octopus: mon/MonCommands.h: target_size_ratio range is incorrect
- Kamoltat Sirivadhna wrote:
> https://github.com/ceph/ceph/pull/45398
merged
- 08:46 PM Backport #55012: octopus: librados: check latest osdmap on ENOENT in pool_reverse_lookup()
- Backport Bot wrote:
> https://github.com/ceph/ceph/pull/45587
merged
- 08:12 PM Backport #53550: octopus: [RFE] Provide warning when the 'require-osd-release' flag does not matc...
- Backport Bot wrote:
> https://github.com/ceph/ceph/pull/44260
merged
- 04:58 PM Backport #52078 (Resolved): pacific: api_tier_pp: [ FAILED ] LibRadosTwoPoolsPP.HitSetWrite
- 04:58 PM Backport #55047 (Resolved): quincy: rados/test.sh hangs while running LibRadosTwoPoolsPP.Manifest...
- 04:58 PM Backport #55439 (Resolved): quincy: FAILED ceph_assert due to issue manifest API to the original ...
- 04:56 PM Backport #54468 (Resolved): octopus: Setting osd_pg_max_concurrent_snap_trims to 0 prematurely cl...
- 04:15 PM Backport #54468: octopus: Setting osd_pg_max_concurrent_snap_trims to 0 prematurely clears the sn...
- Backport Bot wrote:
> https://github.com/ceph/ceph/pull/45324
merged
- 04:56 PM Backport #55074 (Resolved): octopus: osd: osd_fast_shutdown_notify_mon not quite right
- 04:13 PM Backport #55074: octopus: osd: osd_fast_shutdown_notify_mon not quite right
- Laura Flores wrote:
> https://github.com/ceph/ceph/pull/45655
merged
- 04:15 PM Bug #54592: partial recovery: CEPH_OSD_OP_OMAPRMKEYRANGE should mark omap dirty
- https://github.com/ceph/ceph/pull/45593 merged
- 03:53 PM Bug #52993: upgrade:octopus-x Test: Upgrade test failed due to timeout of the "ceph pg dump" command
- Similar problem happened on a rados/singleton test for Octopus:
/a/yuriw-2022-04-26_20:58:55-rados-wip-yuri2-testi...
- 05:22 AM Bug #48440: log [ERR] : scrub mismatch
- /home/teuthworker/archive/yuriw-2022-04-29_15:44:49-rados-wip-yuri5-testing-2022-04-28-1007-distro-default-smithi/681...
- 05:20 AM Bug #45721: CommandFailedError: Command failed (workunit test rados/test_python.sh) FAIL: test_ra...
- /home/teuthworker/archive/yuriw-2022-04-29_15:44:49-rados-wip-yuri5-testing-2022-04-28-1007-distro-default-smithi/681...
- 05:18 AM Bug #49591: no active mgr (MGR_DOWN)" in cluster log
- /home/teuthworker/archive/yuriw-2022-04-29_15:44:49-rados-wip-yuri5-testing-2022-04-28-1007-distro-default-smithi/681...
05/10/2022
- 12:38 PM Backport #53971 (In Progress): octopus: BufferList.rebuild_aligned_size_and_memory failure
- https://github.com/ceph/ceph/pull/46216
- 12:32 PM Backport #53972 (In Progress): pacific: BufferList.rebuild_aligned_size_and_memory failure
- https://github.com/ceph/ceph/pull/46215
- 04:38 AM Bug #53327: osd: osd_fast_shutdown_notify_mon not quite right and enable osd_fast_shutdown_notify...
- they did: https://tracker.ceph.com/issues/55074
- 01:59 AM Bug #53327: osd: osd_fast_shutdown_notify_mon not quite right and enable osd_fast_shutdown_notify...
- octopus: osd/OSD: osd_fast_shutdown_notify_mon not quite right #45655
https://github.com/ceph/ceph/pull/45655/commit... - 04:31 AM Backport #55067: octopus: osd_fast_shutdown_notify_mon option should be true by default
- jianwei zhang wrote:
> osd_fast_shutdown(true)
> osd_fast_shutdown_notify_mon(false)
> osd_mon_shutdown_timeout(5...
- 12:55 AM Backport #55067: octopus: osd_fast_shutdown_notify_mon option should be true by default
- osd_fast_shutdown(true)
osd_fast_shutdown_notify_mon(false)
osd_mon_shutdown_timeout(5s) --> cannot send MOSDMar...
- 12:49 AM Backport #55067: octopus: osd_fast_shutdown_notify_mon option should be true by default
- ...
- 12:44 AM Backport #55067: octopus: osd_fast_shutdown_notify_mon option should be true by default
- mon.a/c has millions of osd_failure (immediate+timeout). There should be messages forwarded by mon.c.
- 12:41 AM Backport #55067: octopus: osd_fast_shutdown_notify_mon option should be true by default
- ceph version: v15.2.13
I found a problem with the mon election, which should be related to it.
Test steps when ...
- 01:02 AM Bug #53328 (Duplicate): osd_fast_shutdown_notify_mon option should be true by default
05/09/2022
- 04:47 PM Bug #55582 (New): octopus: api_watch_notify: LibRadosWatchNotify.AioWatchDelete2 fails because `r...
- /a/lflores-2022-05-09_14:54:06-rados-wip-55077-octopus-distro-default-smithi/6828789...
- 04:10 PM Bug #48793: out of order op
- @Neha @Ronen this Octopus failure looks a lot like this Tracker. Was the revised scrub code backported to Octopus, or...
- 03:37 PM Backport #55581 (Rejected): octopus: api_list: LibRadosList.EnumerateObjects and LibRadosList.Enu...
- 03:35 PM Bug #52553: pybind: rados.RadosStateError raised when closed watch object goes out of scope after...
- /a/lflores-2022-05-04_18:59:38-rados-wip-55077-octopus-distro-default-smithi/6821227...
- 03:31 PM Bug #48899 (Pending Backport): api_list: LibRadosList.EnumerateObjects and LibRadosList.Enumerate...
- /a/lflores-2022-05-04_18:59:38-rados-wip-55077-octopus-distro-default-smithi/6820998...
- 11:43 AM Bug #54182: OSD_TOO_MANY_REPAIRS cannot be cleared in >=Octopus
- I just observed this issue once more and forgot to drop the info that a restart of an OSD actually resets this counte...
- 10:08 AM Bug #50608: ceph_assert(is_primary()) in PrimaryLogPG::on_local_recover
- (upgrade and restart OSDs is probably more accurate wording). If I upgrade node #2 and OSD on node #1 would die with ...
- 10:07 AM Bug #50608: ceph_assert(is_primary()) in PrimaryLogPG::on_local_recover
- Always happens when you upgrade nodes, probably some timing issue with PGs going or flapping primary. I never have de...
- 10:06 AM Bug #50608: ceph_assert(is_primary()) in PrimaryLogPG::on_local_recover
- 'virtual void PrimaryLogPG::on_local_recover(const hobject_t&, const ObjectRecoveryInfo&, ObjectContextRef, bool, Obj...
- 07:32 AM Bug #55573: stretch mode: be more sane about changing different size/min_size
- Realized my suggestion/formula in the mailing list wasn't good :)
This is what I intended originally:
- degraded ...
05/06/2022
- 04:46 PM Bug #55573 (New): stretch mode: be more sane about changing different size/min_size
- From the mailing list:
I created 2 aditional pools each with a matching stretch rule:
- size=2/min=1 (not advised...
- 01:01 AM Bug #55549: OSDs crashing
- After days of fighting this (it's on a production cluster) I finally gave up on the least important of the pools -- t...
05/05/2022
- 04:23 PM Bug #47025: rados/test.sh: api_watch_notify_pp LibRadosWatchNotifyECPP.WatchNotify failed
- This is from the 16.2.8 run.
/a/yuriw-2022-05-04_20:09:21-rados-pacific-distro-default-smithi/6821705...
- 03:22 PM Backport #55439: quincy: FAILED ceph_assert due to issue manifest API to the original object
- Backport Bot wrote:
> https://github.com/ceph/ceph/pull/46061
merged
- 03:20 PM Backport #55047: quincy: rados/test.sh hangs while running LibRadosTwoPoolsPP.ManifestFlushDupCount
- Backport Bot wrote:
> https://github.com/ceph/ceph/pull/45624
merged
- 03:13 PM Bug #55559 (Duplicate): osd-backfill-stats.sh fails in TEST_backfill_ec_prim_out
- /a/yuriw-2022-04-28_14:23:18-rados-wip-yuri-testing-2022-04-27-1456-quincy-distro-default-smithi/6811107...
- 01:37 PM Bug #53729: ceph-osd takes all memory before oom on boot
- Can I kindly ask if there's an estimate when this will be fixed and backported? We have customers that have been in t...
05/04/2022
- 10:20 PM Bug #55549 (Resolved): OSDs crashing
- My apologies if this is the wrong project; I'm so lost on this particular issue that I'm not even sure where to ask f...
- 08:11 PM Bug #55407: quincy osd's fail to boot and crash
- It seems I messed up everything... Let me start over.
I have a ceph cluster running since looooong time ago. Recent...
- 05:56 PM Bug #55407: quincy osd's fail to boot and crash
- Gonzalo Aguilar Delgado wrote:
> It doesn't matter. This is just a side effect. I mean... The bug is not caused by t...
- 05:52 PM Bug #53789: CommandFailedError (rados/test_python.sh): "RADOS object not found" causes test_rados...
- I think the lack of @-2@ (@ENOENT@) **might** be caused by the errno normalization @Objecter@ has.
- 07:42 AM Bug #53789: CommandFailedError (rados/test_python.sh): "RADOS object not found" causes test_rados...
- I hit another issue when we have socket failure injection active when running the tests. I think this is not only the...
- 05:43 PM Bug #52624: qa: "Health check failed: Reduced data availability: 1 pg peering (PG_AVAILABILITY)"
- To judge how severe the problem really is, we need to know whether the stall is permanent (PG gets stuck and t...
- 01:35 PM Bug #52624: qa: "Health check failed: Reduced data availability: 1 pg peering (PG_AVAILABILITY)"
- "These PG_AVAILABILITY warnings are frequently seen with snap-schedule teuthology jobs.":https://pulpito.ceph.com/mcha...
- 05:37 PM Bug #52657: MOSDPGLog::encode_payload(uint64_t): Assertion `HAVE_FEATURE(features, SERVER_NAUTILUS)'
- Just for the record: we suspect the issue is related to the error injection in async-msgr. Some runs without them are sup...
- 11:30 AM Backport #55543 (Resolved): quincy: should use TCMalloc for better performance
- https://github.com/ceph/ceph/pull/47927
- 11:30 AM Backport #55542 (Rejected): octopus: should use TCMalloc for better performance
- 11:30 AM Backport #55541 (Rejected): pacific: should use TCMalloc for better performance
- https://github.com/ceph/ceph/pull/51282
- 11:29 AM Bug #55519 (Pending Backport): should use TCMalloc for better performance
- 06:13 AM Documentation #46120 (Resolved): Improve ceph-objectstore-tool documentation
- This issue has been resolved. The ceph-objectstore-tool documentation now exists, and there's even a good manpage.
...
05/03/2022
- 07:48 PM Bug #54509: FAILED ceph_assert due to issue manifest API to the original object
- Myoungwon Oh wrote:
> https://github.com/ceph/ceph/pull/46120
Thanks for looking into it and creating the backport!
- 02:18 AM Bug #54509: FAILED ceph_assert due to issue manifest API to the original object
- https://github.com/ceph/ceph/pull/46120
- 01:40 AM Bug #54509: FAILED ceph_assert due to issue manifest API to the original object
- I think this is the same issue as https://tracker.ceph.com/issues/50806.
This issue was already fixed, but not backp...
- 01:07 AM Bug #54509: FAILED ceph_assert due to issue manifest API to the original object
- Sure.
- 07:46 PM Backport #50893 (In Progress): pacific: osd/PrimaryLogPG.cc: FAILED ceph_assert(attrs || !recover...
- 06:26 PM Backport #55019: octopus: partial recovery: CEPH_OSD_OP_OMAPRMKEYRANGE should mark omap dirty
- Christian Rohmann wrote:
> Sorry for being a nag ... I initially reported https://tracker.ceph.com/issues/53663 and ...
- 04:41 PM Bug #55383: monitor cluster logs(ceph.log) appear empty until rotated
- 玮文 胡 wrote:
> Maybe we should fix the release note(https://docs.ceph.com/en/latest/releases/quincy/) first? The work...
- 04:32 PM Bug #55383: monitor cluster logs(ceph.log) appear empty until rotated
- 玮文 胡 wrote:
> https://github.com/ceph/ceph/pull/46124
>
> Tested locally with
>
> [...]
Thank you.
- 04:27 PM Bug #55383 (Fix Under Review): monitor cluster logs(ceph.log) appear empty until rotated
- 09:55 AM Bug #55383: monitor cluster logs(ceph.log) appear empty until rotated
- Maybe we should fix the release note(https://docs.ceph.com/en/latest/releases/quincy/) first? The workaround there is...
- 09:44 AM Bug #55383: monitor cluster logs(ceph.log) appear empty until rotated
- https://github.com/ceph/ceph/pull/46124
Tested locally with...
- 03:49 PM Bug #51076: "wait_for_recovery: failed before timeout expired" during thrashosd test with EC back...
- Ran 100 thrash-erasure-code-big tests in octopus, and the `wait_for_recovery` assertion occurred 18/100 times, with 1...
- 11:19 AM Bug #55407: quincy osd's fail to boot and crash
- It doesn't matter. This is just a side effect. I mean... The bug is not caused by the tool.
The bug is caused becau...
- 07:23 AM Bug #55519 (Fix Under Review): should use TCMalloc for better performance
- 07:22 AM Bug #55519 (Resolved): should use TCMalloc for better performance
- We had been using TCMalloc in older releases, but somehow we stopped doing so. Let's bring it back.
05/02/2022
- 06:50 PM Bug #55383: monitor cluster logs(ceph.log) appear empty until rotated
- To me it looks like this is the problem:
https://github.com/ceph/ceph/commit/7c84e06e6f846f6b4b6fd959218b4d474520f429...
- 05:29 PM Bug #54509: FAILED ceph_assert due to issue manifest API to the original object
- Myoungwon Oh: Seeing this in pacific as well, can you confirm if it is the same issue?
/a/yuriw-2022-04-30_17:01:...
- 02:12 PM Bug #53789 (In Progress): CommandFailedError (rados/test_python.sh): "RADOS object not found" cau...
- 01:11 PM Bug #52657: MOSDPGLog::encode_payload(uint64_t): Assertion `HAVE_FEATURE(features, SERVER_NAUTILUS)'
- Scheduled another run with just the rados/verify test that failed and I can see this happen frequently:
/a/amathu...
- 01:07 PM Backport #55513 (In Progress): quincy: mount.ceph fails to understand AAAA records from SRV record
- 12:57 PM Backport #55513 (Resolved): quincy: mount.ceph fails to understand AAAA records from SRV record
- https://github.com/ceph/ceph/pull/46113
- 01:03 PM Backport #55514 (In Progress): pacific: mount.ceph fails to understand AAAA records from SRV record
- 12:57 PM Backport #55514 (Resolved): pacific: mount.ceph fails to understand AAAA records from SRV record
- https://github.com/ceph/ceph/pull/46112
- 12:52 PM Bug #47300 (Pending Backport): mount.ceph fails to understand AAAA records from SRV record
- 08:34 AM Bug #47300 (Resolved): mount.ceph fails to understand AAAA records from SRV record
05/01/2022
- 05:40 AM Bug #43887 (Fix Under Review): ceph_test_rados_delete_pools_parallel failure
- https://github.com/ceph/ceph/pull/46099
04/28/2022
- 09:29 PM Bug #55488 (New): ENOENT on clone on EC non-primary shard
- ...
- 06:59 PM Bug #45721: CommandFailedError: Command failed (workunit test rados/test_python.sh) FAIL: test_ra...
- /a/yuriw-2022-04-27_02:52:22-rados-pacific-distro-default-smithi/6807766...
04/27/2022
- 09:51 PM Backport #55439 (In Progress): quincy: FAILED ceph_assert due to issue manifest API to the origin...
- 09:25 PM Bug #52657: MOSDPGLog::encode_payload(uint64_t): Assertion `HAVE_FEATURE(features, SERVER_NAUTILUS)'
- Laura Flores wrote:
> This one looks somewhat different from the other reported failures. First of all, it failed on...
- 05:37 PM Bug #52657: MOSDPGLog::encode_payload(uint64_t): Assertion `HAVE_FEATURE(features, SERVER_NAUTILUS)'
- Let's discuss this on the next RADOS Team Meeting.
- 06:07 PM Bug #55424 (Won't Fix): ceph-mon process exit in dead status , which backtrace displayed has bloc...
- Sorry, the version is EOL :-(.
- 06:06 PM Bug #55419 (Resolved): cephtool/test.sh: failure on blocklist testing
- 05:58 PM Bug #55440: osd-scrub-test.sh: TEST_scrub_test failed due to inconsistent PG
- ...
- 05:56 PM Bug #49754: osd/OSD.cc: ceph_abort_msg("abort() called") during OSD::shutdown()
- Laura Flores wrote:
> /a/yuriw-2022-04-26_00:11:14-rados-wip-55324-pacific-backport-distro-default-smithi/6805265/re...
- 05:49 PM Bug #55407: quincy osd's fail to boot and crash
- Hello Gonzalo!
Just a quick note from a bug scrub: we don't support mixing the tool from a newer release with OSDs fr...
- 05:39 PM Bug #51076: "wait_for_recovery: failed before timeout expired" during thrashosd test with EC back...
- This was discussed in the rados meeting this week. Laura is trying to check if the bug exists in Octopus or not, to h...
- 05:34 PM Bug #55433 (Closed): common: FAILED ceph_assert(((lock).is_locked()))
- 05:34 PM Bug #55433 (Closed): common: FAILED ceph_assert(((lock).is_locked()))
- The fix will be merged with the original PR.
- 10:24 AM Bug #47300 (Fix Under Review): mount.ceph fails to understand AAAA records from SRV record
04/26/2022
- 03:58 PM Bug #52657: MOSDPGLog::encode_payload(uint64_t): Assertion `HAVE_FEATURE(features, SERVER_NAUTILUS)'
- This one looks somewhat different from the other reported failures. First of all, it failed on a rados/verify test, n...
- 02:33 PM Bug #49754: osd/OSD.cc: ceph_abort_msg("abort() called") during OSD::shutdown()
- /a/yuriw-2022-04-26_00:11:14-rados-wip-55324-pacific-backport-distro-default-smithi/6805265/remote/smithi061/crash/20...
- 10:27 AM Bug #55450 (Resolved): [DOC] stretch_rule defined in the doc needs updation
- In section [1], the stretch_rule defined to be added to the crush map needs to be updated.
min size and max size par...
04/25/2022
- 10:11 PM Bug #55433: common: FAILED ceph_assert(((lock).is_locked()))
- https://github.com/ceph/ceph/pull/46028 has been merged to unblock other master PR merges.
- 06:28 PM Bug #55433 (Fix Under Review): common: FAILED ceph_assert(((lock).is_locked()))
- 05:36 PM Bug #55433 (Closed): common: FAILED ceph_assert(((lock).is_locked()))
- Seen in jenkins make check tests, i.e. https://jenkins.ceph.com/job/ceph-pull-requests/94227/console...
- 09:59 PM Bug #55440 (New): osd-scrub-test.sh: TEST_scrub_test failed due to inconsistent PG
- /a/yuriw-2022-04-22_13:56:48-rados-wip-yuri2-testing-2022-04-22-0500-distro-default-smithi/6800338...
- 07:31 PM Bug #44595: cache tiering: Error: oid 48 copy_from 493 returned error code -2
- /a/yuriw-2022-04-25_14:14:44-rados-wip-yuri3-testing-2022-04-22-0534-quincy-distro-default-smithi/6805186...
- 07:07 PM Bug #51076: "wait_for_recovery: failed before timeout expired" during thrashosd test with EC back...
- /a/yuriw-2022-04-22_21:06:04-rados-wip-yuri3-testing-2022-04-22-0534-quincy-distro-default-smithi/6802072
- 07:00 PM Backport #55439 (Resolved): quincy: FAILED ceph_assert due to issue manifest API to the original ...
- https://github.com/ceph/ceph/pull/46061
- 06:59 PM Bug #54509 (Pending Backport): FAILED ceph_assert due to issue manifest API to the original object
- 06:58 PM Bug #54509: FAILED ceph_assert due to issue manifest API to the original object
- /a/yuriw-2022-04-22_21:06:04-rados-wip-yuri3-testing-2022-04-22-0534-quincy-distro-default-smithi/6802065/remote/smit...
- 06:56 PM Bug #55435: mon/Elector: notify_ranked_removed() does not properly erase dead_ping in the case of...
- 06:56 PM Bug #55435: mon/Elector: notify_ranked_removed() does not properly erase dead_ping in the case of...
- In an example scenario where we have 5 monitors:
rank_size = 5
mon.a (rank 0)
mon.b (rank 1)
mon.c (rank 2)
mo...
- 06:29 PM Bug #55435 (Resolved): mon/Elector: notify_ranked_removed() does not properly erase dead_ping in ...
- 05:54 PM Bug #55407: quincy osd's fail to boot and crash
- Gonzalo Aguilar Delgado wrote:
> Neha Ojha wrote:
> > Did you see the same segmentation fault in quincy and pacific...
- 05:50 PM Bug #55407: quincy osd's fail to boot and crash
- The situation is even worse. Any osd created with ceph version 17.1.0 (c675060073a05d40ef404d5921c81178a52af6e0) quin...
- 06:23 AM Bug #55407: quincy osd's fail to boot and crash
- I managed to reproduce...
I installed an OSD with the pacific version. Then I let it run for a while (10 min or so)...
- 05:51 PM Bug #55383: monitor cluster logs(ceph.log) appear empty until rotated
- Looks like it is happening because of mon/LogMonitor changing it back to RADOS.
- 05:49 PM Bug #55383 (Triaged): monitor cluster logs(ceph.log) appear empty until rotated
- 05:49 PM Bug #55383: monitor cluster logs(ceph.log) appear empty until rotated
- If you are okay with it, can you please send a quick fix?
- 05:48 PM Bug #55383: monitor cluster logs(ceph.log) appear empty until rotated
- 玮文 胡 wrote:
> I suspect this issue is due to https://github.com/ceph/ceph/commit/7c84e06e6f846f6b4b6fd959218b4d47452...
- 05:19 PM Bug #55419 (Fix Under Review): cephtool/test.sh: failure on blocklist testing
- 03:23 PM Bug #54458: osd-scrub-snaps.sh: TEST_scrub_snaps failed due to malformed log message
- Perhaps this has resurfaced?
/a/yuriw-2022-04-22_13:56:48-rados-wip-yuri2-testing-2022-04-22-0500-distro-default-s...
- 09:42 AM Bug #55424 (Won't Fix): ceph-mon process exit in dead status , which backtrace displayed has bloc...
- Please see abc.png
LevelDBstore::close
set thread quit flag, compact_queue_stop = true.
then send signal ...
04/23/2022
- 09:59 AM Bug #55383: monitor cluster logs(ceph.log) appear empty until rotated
- I suspect this issue is due to https://github.com/ceph/ceph/commit/7c84e06e6f846f6b4b6fd959218b4d474520f429 and have ...
- 12:04 AM Bug #55419 (In Progress): cephtool/test.sh: failure on blocklist testing
04/22/2022
- 10:00 PM Bug #52153: crash: int MonitorDBStore::apply_transaction(MonitorDBStore::TransactionRef): abort
- I have also seen this crash on my monitor running 16.2.7.
- 09:24 PM Bug #55407: quincy osd's fail to boot and crash
- I saw the stacktrace. This time v17.2.0. Latest...
- 09:20 PM Bug #55407: quincy osd's fail to boot and crash
- Ok. This is the situation:
1.- OSD built from scratch in pacific. (docker pull ceph/daemon:latest-pacific)
2.- U...
- 08:47 PM Bug #55407: quincy osd's fail to boot and crash
- Igor Fedotov wrote:
> >2022-04-22T13:34:42.419+0000 7fd5798ed080 -1 bluefs _replay 0x11000: stop: unrecognized op 12...
- 03:36 PM Bug #55407: quincy osd's fail to boot and crash
- >2022-04-22T13:34:42.419+0000 7fd5798ed080 -1 bluefs _replay 0x11000: stop: unrecognized op 12
@Gonzalo, AFAIU you...
- 01:38 PM Bug #55407: quincy osd's fail to boot and crash
- Neha Ojha wrote:
> Did you see the same segmentation fault in quincy and pacific? Were you testing a custom build of...
- 09:21 PM Bug #55419 (Resolved): cephtool/test.sh: failure on blocklist testing
- /a/yuriw-2022-04-22_13:56:48-rados-wip-yuri2-testing-2022-04-22-0500-distro-default-smithi/6800292...
- 07:52 PM Bug #24057 (Rejected): cbt fails to copy results to the archive dir
- 06:27 PM Bug #43189 (Resolved): pgs stuck in laggy state
- 06:27 PM Backport #43232 (Rejected): nautilus: pgs stuck in laggy state
- Nautilus is EOL
- 06:26 PM Bug #41385 (Resolved): osd/ReplicatedBackend.cc: 1349: FAILED ceph_assert(peer_missing.count(from...
- 06:26 PM Backport #41731 (Rejected): nautilus: osd/ReplicatedBackend.cc: 1349: FAILED ceph_assert(peer_mis...
- Nautilus is EOL
- 02:46 PM Backport #55405 (In Progress): quincy: librados C++ API requires C++17 to build
- https://github.com/ceph/ceph/pull/46005
- 02:41 PM Backport #55406 (In Progress): pacific: librados C++ API requires C++17 to build
- https://github.com/ceph/ceph/pull/46004
04/21/2022
- 11:11 PM Bug #55407 (Need More Info): quincy osd's fail to boot and crash
- Did you see the same segmentation fault in quincy and pacific? Were you testing a custom build of ceph (17.1.0 is a d...
- 08:20 PM Bug #55407 (Rejected): quincy osd's fail to boot and crash
- I have a cluster with pacific. One of the OSDs started to crash...
So I zapped the disk and recreated it again. I foun...
- 08:21 PM Bug #53729: ceph-osd takes all memory before oom on boot
- Gonzalo Aguilar Delgado wrote:
> I suppose this thread can be closed as soon as the fix is in master. But just for r... - 06:10 PM Bug #53729: ceph-osd takes all memory before oom on boot
- I suppose this thread can be closed as soon as the fix is in master. But just for reference, in case has something to...
- 05:01 PM Bug #53729: ceph-osd takes all memory before oom on boot
- Mykola Golub wrote:
> Gonzalo Aguilar Delgado wrote:
>
> > Mykola, specially thank you for doing the patch.
>
...
- 04:39 PM Bug #53729: ceph-osd takes all memory before oom on boot
- Gonzalo Aguilar Delgado wrote:
> Mykola, specially thank you for doing the patch.
I am not the author of the pa...
- 04:34 PM Bug #53729: ceph-osd takes all memory before oom on boot
- Yesssss!!! Great job team!
It's up & running. It purged dups, booted the ceph-osd and only 1/2Gb RAM full booted. ...
- 04:23 PM Bug #53729: ceph-osd takes all memory before oom on boot
- Mykola Golub wrote:
> Gonzalo Aguilar Delgado wrote:
>
> > CEPH_ARGS="--osd_pg_log_trim_max=10000 --osd_max_pg_lo...
- 07:30 AM Bug #53729: ceph-osd takes all memory before oom on boot
- Gonzalo Aguilar Delgado wrote:
> CEPH_ARGS="--osd_pg_log_trim_max=10000 --osd_max_pg_log_entries=2000 " LD_LIBRARY...
- 07:01 AM Bug #53729: ceph-osd takes all memory before oom on boot
- Nitzan Mordechai wrote:
> Gonzalo Aguilar Delgado wrote:
> > Nitzan Mordechai wrote:
> > > Can you please add the ...
- 07:28 PM Bug #55383: monitor cluster logs(ceph.log) appear empty until rotated
- Just for completeness: as expected, this issue happened again today in the gibba cluster during the log rotation window and...
- 05:14 PM Feature #54115 (In Progress): Log pglog entry size in OSD log if it exceeds certain size limit
- 04:25 PM Backport #55406 (Rejected): pacific: librados C++ API requires C++17 to build
- https://github.com/ceph/ceph/pull/46004
- 04:25 PM Backport #55405 (In Progress): quincy: librados C++ API requires C++17 to build
- 04:22 PM Bug #55233: librados C++ API requires C++17 to build
- The c++ api was created only for internal use. It should not be held to such a guarantee. At least, that's what I u...
- 04:20 PM Bug #55233 (Pending Backport): librados C++ API requires C++17 to build
- 03:31 PM Feature #53050 (Pending Backport): Support blocklisting a CIDR range
- 03:31 PM Feature #53050 (Resolved): Support blocklisting a CIDR range
- 12:21 PM Feature #55402 (New): rgw: Add dbstore & cloud-transition test-suites to teuthology
- Add new test-suites to teuthology for below RGW features -
* cloud-transition
* dbstore backend
- 03:40 AM Bug #55355: osd thread deadlock
- Thanks for your reply @Radoslaw Zarzynski.
I checked the latest code and found that the code logic is the same. I th...
04/20/2022
- 09:08 PM Bug #51076: "wait_for_recovery: failed before timeout expired" during thrashosd test with EC back...
- Looking further into this issue, it looks like the bug occurs whenever "prep_object_replica_pushes()" is called, and ...
- 08:22 PM Bug #55383: monitor cluster logs(ceph.log) appear empty until rotated
- Neha noticed today that in the LRC cluster, even with this workaround in place, when the cluster went through a log r...
- 06:50 PM Bug #49231: MONs unresponsive over extended periods of time
- Mimic is EOL :-(. Would you be able to upgrade soon?
- 06:43 PM Bug #55101 (New): mon has slow op
- 06:33 PM Bug #55255 (Need More Info): "ceph iostat" exception!
- 06:33 PM Bug #55255: "ceph iostat" exception!
- Hello! How do you synchronize the clocks in your cluster? Is NTP properly running?
I'm asking about that to ensure... - 06:24 PM Bug #47025: rados/test.sh: api_watch_notify_pp LibRadosWatchNotifyECPP.WatchNotify failed
- @Laura, @Nitzan: the assertion failure in the comment #7 is about @reply_map.size()@ while other occurrences mention ...
- 06:06 PM Bug #50608: ceph_assert(is_primary()) in PrimaryLogPG::on_local_recover
- Hello! Looks like it's reproducible, which is good.
Would you be able to provide logs with extra debugs as mentioned in ht...
- 06:02 PM Bug #55355: osd thread deadlock
- Hello! @14.2.22@ is end-of-life actually. Would you be able to verify the issue on a newer one?
- 09:53 AM Bug #55355: osd thread deadlock
- I find thread 45 wants to stop the connection, but the connection has been stopped by thread 71
@
(gdb) f 5
#5 Async...
- 05:57 PM Bug #51463 (Resolved): blocked requests while stopping/starting OSDs
- 05:55 PM Bug #52948: osd: fails to come up: "teuthology.misc:7 of 8 OSDs are up"
- Lowering the priority as the last reproduction is 5 months old.
- 05:51 PM Bug #53924: EC PG stuckrecovery_unfound+undersized+degraded+remapped+peered
- Good to know and thanks for your testing!
Just for the record: leaving the bug in the @Need More Info@ state as the...
- 04:37 PM Bug #53729: ceph-osd takes all memory before oom on boot
- Gonzalo Aguilar Delgado wrote:
> Nitzan Mordechai wrote:
> > Can you please add the output of trim-pg-log ?
> > CE...
- 04:23 PM Bug #53729: ceph-osd takes all memory before oom on boot
- Mykola Golub wrote:
> Just as information that might be useful for someone. Although ceph-objectstore-tool is a more...
- 04:21 PM Bug #53729: ceph-osd takes all memory before oom on boot
- Nitzan Mordechai wrote:
> Can you please add the output of trim-pg-log ?
> CEPH_ARGS="--osd_pg_log_trim_max=10000 -...
- 02:47 PM Bug #53729: ceph-osd takes all memory before oom on boot
- Just as information that might be useful for someone. Although ceph-objectstore-tool is a more reliable way to confir...
- 12:03 PM Bug #53729: ceph-osd takes all memory before oom on boot
- Can you please add the output of trim-pg-log ?
CEPH_ARGS="--osd_pg_log_trim_max=10000 --osd_max_pg_log_entries=2000 ...
- 11:06 AM Bug #53729: ceph-osd takes all memory before oom on boot
- After a while it crashed...
-34> 2022-04-20T11:02:25.218+0000 7f72b3b83640 5 rocksdb: commit_cache_size High Pri... - 10:50 AM Bug #53729: ceph-osd takes all memory before oom on boot
- I've built the repo from git@github.com:NitzanMordhai/ceph.git branch origin/wip-nitzan-pglog-dups-not-trimmed
And...
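For reference, the @trim-pg-log@ run requested above would look roughly like this. This is only a sketch: the OSD data path and PG id below are hypothetical placeholders, and the OSD must be stopped before pointing @ceph-objectstore-tool@ at its store.

```sh
# Stop the OSD first, e.g.: systemctl stop ceph-osd@0
# Data path and PG id are hypothetical -- substitute your own.
CEPH_ARGS="--osd_pg_log_trim_max=10000 --osd_max_pg_log_entries=2000" \
ceph-objectstore-tool \
  --data-path /var/lib/ceph/osd/ceph-0 \
  --pgid 1.0 \
  --op trim-pg-log
```

Repeat per affected PG; restart the OSD afterwards.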
04/19/2022
- 11:45 PM Bug #55383: monitor cluster logs(ceph.log) appear empty until rotated
- we found out that quincy has https://github.com/ceph/ceph/pull/40640 log_to_journald feature. When we set ...
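A rough sketch of flipping the relevant logging options back so that @ceph.log@ is written to file again (option names as introduced around PR #40640; verify them against your build with @ceph config help@ before relying on this):

```sh
# Assumed option names -- confirm with: ceph config help log_to_journald
ceph config set global log_to_journald false
ceph config set global mon_cluster_log_to_journald false
ceph config set global log_to_file true
ceph config set global mon_cluster_log_to_file true
```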
- 05:48 PM Bug #55383: monitor cluster logs(ceph.log) appear empty until rotated
- Tim Wilkinson wrote:
> While executing tests on 17.1.0-203-g2c8d01fc, I see the ceph.log files on all MONs are zero ... - 04:56 PM Bug #55383: monitor cluster logs(ceph.log) appear empty until rotated
- Gibba cluster quincy version `17.1.0-163-g4e244311`.
- 04:52 PM Bug #55383: monitor cluster logs(ceph.log) appear empty until rotated
- I don't think we have seen this issue in previous quincy builds; something must have changed recently.
- 04:51 PM Bug #55383: monitor cluster logs(ceph.log) appear empty until rotated
- Tim Wilkinson wrote:
> While executing tests on 17.1.0-203-g2c8d01fc, I see the ceph.log files on all MONs are zero ... - 04:43 PM Bug #55383 (Resolved): monitor cluster logs(ceph.log) appear empty until rotated
- While executing tests on 17.1.0-203-g2c8d01fc, I see the ceph.log files on all MONs are zero length unless rotated.
... - 08:13 AM Backport #55019: octopus: partial recovery: CEPH_OSD_OP_OMAPRMKEYRANGE should mark omap dirty
- Sorry for being a nag ... I initially reported https://tracker.ceph.com/issues/53663 and still observe the issues of ...
04/18/2022
- 12:55 PM Bug #55355 (Resolved): osd thread deadlock
- My Ceph version is 14.2.22.
After the network became abnormal, the OSD could not rejoin the cluster.
Then I find the osd th...
04/16/2022
- 08:18 AM Bug #50608: ceph_assert(is_primary()) in PrimaryLogPG::on_local_recover
- 2022-04-16T00:06:06.526+0200 7f6997402700 -1 osd.110 1166753 heartbeat_check: no reply from <censor>:6812 osd.109 sin...
- 08:15 AM Bug #50608: ceph_assert(is_primary()) in PrimaryLogPG::on_local_recover
- There is a lot of lines like this before the crash line in the log file
-24> 2022-04-16T00:06:02.105+0200 7fede... - 08:14 AM Bug #50608: ceph_assert(is_primary()) in PrimaryLogPG::on_local_recover
- 2022-04-16T00:06:08.540+0200 7fedde264700 0 log_channel(cluster) log [WRN] : Monitor daemon marked osd.109 down, but...
- 08:09 AM Bug #50608: ceph_assert(is_primary()) in PrimaryLogPG::on_local_recover
- OSD crashed with this again
{
"archived": "2022-04-15 23:07:21.580173",
"assert_condition": "is_primary(...
04/15/2022
- 02:08 AM Bug #53924: EC PG stuckrecovery_unfound+undersized+degraded+remapped+peered
- the patch
osd/PeeringState: fix acting_set_writeable min_size check
can resolve ceph v15.2.13 recovery_unfoun...
- 02:07 AM Bug #53924: EC PG stuckrecovery_unfound+undersized+degraded+remapped+peered
- jianwei zhang wrote:
> Radoslaw Zarzynski wrote:
> > > the all osds is up&in, so the case doesn't involve recovery_...
04/14/2022
- 07:58 AM Bug #52948: osd: fails to come up: "teuthology.misc:7 of 8 OSDs are up"
- Radoslaw Zarzynski wrote:
> Hello Sridhar! Is there anything new? Have we discussed it already maybe?
Hello Radek...
- 07:58 AM Bug #51463: blocked requests while stopping/starting OSDs
- Yes. This is fixed by these two tasks:
https://tracker.ceph.com/issues/53327
https://tracker.ceph.com/issues/53326