Activity

From 04/14/2022 to 05/13/2022

05/13/2022

10:49 PM Bug #45721: CommandFailedError: Command failed (workunit test rados/test_python.sh) FAIL: test_ra...
I reproduced the symptoms of this bug locally by incrementing the notify count before an eq check. The extra incremen... Laura Flores
09:29 AM Backport #55067: octopus: osd_fast_shutdown_notify_mon option should be true by default
My test result:... jianwei zhang

05/12/2022

11:10 PM Backport #55633 (In Progress): octopus: ceph-osd takes all memory before oom on boot
https://github.com/ceph/ceph/pull/46253 Radoslaw Zarzynski
06:07 PM Backport #55633 (Rejected): octopus: ceph-osd takes all memory before oom on boot
Backport Bot
10:56 PM Backport #52077: octopus: api_tier_pp: [ FAILED ] LibRadosTwoPoolsPP.HitSetWrite
Backport Bot wrote:
> https://github.com/ceph/ceph/pull/45320
merged
Yuri Weinstein
10:42 PM Backport #55631 (In Progress): pacific: ceph-osd takes all memory before oom on boot
https://github.com/ceph/ceph/pull/46252 Radoslaw Zarzynski
06:06 PM Backport #55631 (Resolved): pacific: ceph-osd takes all memory before oom on boot
Backport Bot
10:39 PM Backport #55632 (In Progress): quincy: ceph-osd takes all memory before oom on boot
https://github.com/ceph/ceph/pull/46251 Radoslaw Zarzynski
06:06 PM Backport #55632 (Resolved): quincy: ceph-osd takes all memory before oom on boot
Backport Bot
06:52 PM Bug #55559: osd-backfill-stats.sh fails in TEST_backfill_ec_prim_out
Hello Laura! Is there a thing that makes you think this isn't a duplicate of #47026? Radoslaw Zarzynski
06:48 PM Bug #49689: osd/PeeringState.cc: ceph_abort_msg("past_interval start interval mismatch") start
If more data is necessary, it might be worth contacting Richard Bateman, who replicated something awfully similar to t... Radoslaw Zarzynski
06:29 PM Bug #55582: octopus: api_watch_notify: LibRadosWatchNotify.AioWatchDelete2 fails because `rados_w...
Yet another one in the family of Watch / Notify ENOENT -> ENOTCONN bugs. Radoslaw Zarzynski
06:28 PM Bug #44229: monclient: _check_auth_rotating possible clock skew, rotating keys expired way too early
... Radoslaw Zarzynski
06:25 PM Bug #44229 (New): monclient: _check_auth_rotating possible clock skew, rotating keys expired way ...
Perhaps this replicated in:
/home/teuthworker/archive/yuriw-2022-04-29_15:44:49-rados-wip-yuri5-testing-2022-04-28...
Radoslaw Zarzynski
06:28 PM Bug #49591: no active mgr (MGR_DOWN)" in cluster log
I can't find @Degraded data redundancy@ in the mgr's log but I can find messages about expired cephx keys:... Radoslaw Zarzynski
06:09 PM Bug #52993: upgrade:octopus-x Test: Upgrade test failed due to timeout of the "ceph pg dump" command
We haven't backported the fix for https://tracker.ceph.com/issues/51815 to Octopus (per Neha's explanation). Radoslaw Zarzynski
06:02 PM Bug #47299: Assertion in pg_missing_set: p->second.need <= v || p->second.is_delete()
Hello! A note from a bug scrub:
1. This issue looks like it is caused by particular data stored in the OSD which
2....
Radoslaw Zarzynski
09:33 AM Bug #47299: Assertion in pg_missing_set: p->second.need <= v || p->second.is_delete()
... Tobias Urdin
05:53 PM Bug #48440 (Need More Info): log [ERR] : scrub mismatch
We would need to ensure the latest reoccurrence is about the OSD scrub (we haven't seen too many mon scrubbing issues ... Radoslaw Zarzynski
05:45 PM Bug #53729 (Pending Backport): ceph-osd takes all memory before oom on boot
Neha Ojha
02:29 PM Backport #55624 (In Progress): quincy: Unable to format `ceph config dump` command output in yaml...
Laura Flores
02:26 PM Backport #55624 (Resolved): quincy: Unable to format `ceph config dump` command output in yaml us...
https://github.com/ceph/ceph/pull/46246 Laura Flores
02:25 PM Bug #53895 (Pending Backport): Unable to format `ceph config dump` command output in yaml using `...
Laura Flores
10:12 AM Backport #55067: octopus: osd_fast_shutdown_notify_mon option should be true by default
I think we should also use failure_pending queue like send_failures to avoid one osd sending target osd to mon multip... jianwei zhang
09:35 AM Backport #55067: octopus: osd_fast_shutdown_notify_mon option should be true by default
... jianwei zhang
06:08 AM Backport #55067: octopus: osd_fast_shutdown_notify_mon option should be true by default

When a node is actively shut down for operation and maintenance,
the osd/mon/mds process on it will automatically...
jianwei zhang
05:55 AM Backport #55067: octopus: osd_fast_shutdown_notify_mon option should be true by default
Nitzan Mordechai wrote:
> jianwei zhang wrote:
> > osd_fast_shutdown(true)
> > osd_fast_shutdown_notify_mon(false)...
jianwei zhang

05/11/2022

08:59 PM Backport #54568: octopus: mon/MonCommands.h: target_size_ratio range is incorrect
Kamoltat Sirivadhna wrote:
> https://github.com/ceph/ceph/pull/45398
merged
Yuri Weinstein
08:46 PM Backport #55012: octopus: librados: check latest osdmap on ENOENT in pool_reverse_lookup()
Backport Bot wrote:
> https://github.com/ceph/ceph/pull/45587
merged
Yuri Weinstein
08:12 PM Backport #53550: octopus: [RFE] Provide warning when the 'require-osd-release' flag does not matc...
Backport Bot wrote:
> https://github.com/ceph/ceph/pull/44260
merged
Yuri Weinstein
04:58 PM Backport #52078 (Resolved): pacific: api_tier_pp: [ FAILED ] LibRadosTwoPoolsPP.HitSetWrite
Laura Flores
04:58 PM Backport #55047 (Resolved): quincy: rados/test.sh hangs while running LibRadosTwoPoolsPP.Manifest...
Laura Flores
04:58 PM Backport #55439 (Resolved): quincy: FAILED ceph_assert due to issue manifest API to the original ...
Laura Flores
04:56 PM Backport #54468 (Resolved): octopus: Setting osd_pg_max_concurrent_snap_trims to 0 prematurely cl...
Laura Flores
04:15 PM Backport #54468: octopus: Setting osd_pg_max_concurrent_snap_trims to 0 prematurely clears the sn...
Backport Bot wrote:
> https://github.com/ceph/ceph/pull/45324
merged
Yuri Weinstein
04:56 PM Backport #55074 (Resolved): octopus: osd: osd_fast_shutdown_notify_mon not quite right
Laura Flores
04:13 PM Backport #55074: octopus: osd: osd_fast_shutdown_notify_mon not quite right
Laura Flores wrote:
> https://github.com/ceph/ceph/pull/45655
merged
Yuri Weinstein
04:15 PM Bug #54592: partial recovery: CEPH_OSD_OP_OMAPRMKEYRANGE should mark omap dirty
https://github.com/ceph/ceph/pull/45593 merged Yuri Weinstein
03:53 PM Bug #52993: upgrade:octopus-x Test: Upgrade test failed due to timeout of the "ceph pg dump" command
Similar problem happened on a rados/singleton test for Octopus:
/a/yuriw-2022-04-26_20:58:55-rados-wip-yuri2-testi...
Laura Flores
05:22 AM Bug #48440: log [ERR] : scrub mismatch
/home/teuthworker/archive/yuriw-2022-04-29_15:44:49-rados-wip-yuri5-testing-2022-04-28-1007-distro-default-smithi/681... Nitzan Mordechai
05:20 AM Bug #45721: CommandFailedError: Command failed (workunit test rados/test_python.sh) FAIL: test_ra...
/home/teuthworker/archive/yuriw-2022-04-29_15:44:49-rados-wip-yuri5-testing-2022-04-28-1007-distro-default-smithi/681... Nitzan Mordechai
05:18 AM Bug #49591: no active mgr (MGR_DOWN)" in cluster log
/home/teuthworker/archive/yuriw-2022-04-29_15:44:49-rados-wip-yuri5-testing-2022-04-28-1007-distro-default-smithi/681... Nitzan Mordechai

05/10/2022

12:38 PM Backport #53971 (In Progress): octopus: BufferList.rebuild_aligned_size_and_memory failure
https://github.com/ceph/ceph/pull/46216 Radoslaw Zarzynski
12:32 PM Backport #53972 (In Progress): pacific: BufferList.rebuild_aligned_size_and_memory failure
https://github.com/ceph/ceph/pull/46215 Radoslaw Zarzynski
04:38 AM Bug #53327: osd: osd_fast_shutdown_notify_mon not quite right and enable osd_fast_shutdown_notify...
they did: https://tracker.ceph.com/issues/55074 Nitzan Mordechai
01:59 AM Bug #53327: osd: osd_fast_shutdown_notify_mon not quite right and enable osd_fast_shutdown_notify...
octopus: osd/OSD: osd_fast_shutdown_notify_mon not quite right #45655
https://github.com/ceph/ceph/pull/45655/commit...
jianwei zhang
04:31 AM Backport #55067: octopus: osd_fast_shutdown_notify_mon option should be true by default
jianwei zhang wrote:
> osd_fast_shutdown(true)
> osd_fast_shutdown_notify_mon(false)
> osd_mon_shutdown_timeout(5...
Nitzan Mordechai
12:55 AM Backport #55067: octopus: osd_fast_shutdown_notify_mon option should be true by default
osd_fast_shutdown(true)
osd_fast_shutdown_notify_mon(false)
osd_mon_shutdown_timeout(5s) --> cannot send MOSDMar...
jianwei zhang
12:49 AM Backport #55067: octopus: osd_fast_shutdown_notify_mon option should be true by default
... jianwei zhang
12:44 AM Backport #55067: octopus: osd_fast_shutdown_notify_mon option should be true by default
mon.a/c has millions of osd_failure (immediate+timeout). There should be messages forwarded by mon.c. jianwei zhang
12:41 AM Backport #55067: octopus: osd_fast_shutdown_notify_mon option should be true by default
ceph version: v15.2.13
I found a problem with the mon election, which should be related to it.
Test steps when ...
jianwei zhang
01:02 AM Bug #53328 (Duplicate): osd_fast_shutdown_notify_mon option should be true by default
Neha Ojha

05/09/2022

04:47 PM Bug #55582 (New): octopus: api_watch_notify: LibRadosWatchNotify.AioWatchDelete2 fails because `r...
/a/lflores-2022-05-09_14:54:06-rados-wip-55077-octopus-distro-default-smithi/6828789... Laura Flores
04:10 PM Bug #48793: out of order op
@Neha @Ronen this Octopus failure looks a lot like this Tracker. Was the revised scrub code backported to Octopus, or... Laura Flores
03:37 PM Backport #55581 (Rejected): octopus: api_list: LibRadosList.EnumerateObjects and LibRadosList.Enu...
Backport Bot
03:35 PM Bug #52553: pybind: rados.RadosStateError raised when closed watch object goes out of scope after...
/a/lflores-2022-05-04_18:59:38-rados-wip-55077-octopus-distro-default-smithi/6821227... Laura Flores
03:31 PM Bug #48899 (Pending Backport): api_list: LibRadosList.EnumerateObjects and LibRadosList.Enumerate...
/a/lflores-2022-05-04_18:59:38-rados-wip-55077-octopus-distro-default-smithi/6820998... Laura Flores
11:43 AM Bug #54182: OSD_TOO_MANY_REPAIRS cannot be cleared in >=Octopus
I just observed this issue once more and forgot to drop the info that a restart of an OSD actually resets this counte... Christian Rohmann
10:08 AM Bug #50608: ceph_assert(is_primary()) in PrimaryLogPG::on_local_recover
(upgrade and restart OSDs is probably more accurate wording). If I upgrade node #2 and OSD on node #1 would die with ... Tobias Urdin
10:07 AM Bug #50608: ceph_assert(is_primary()) in PrimaryLogPG::on_local_recover
Always happens when you upgrade nodes, probably some timing issue with PGs going or flapping primary. I never have de... Tobias Urdin
10:06 AM Bug #50608: ceph_assert(is_primary()) in PrimaryLogPG::on_local_recover
'virtual void PrimaryLogPG::on_local_recover(const hobject_t&, const ObjectRecoveryInfo&, ObjectContextRef, bool, Obj... Tobias Urdin
07:32 AM Bug #55573: stretch mode: be more sane about changing different size/min_size
Realized my suggestion/formula in the mailing list wasn't good :)
This is what I intended originally:
- degraded ...
Eneko Lacunza

05/06/2022

04:46 PM Bug #55573 (New): stretch mode: be more sane about changing different size/min_size
From the mailing list:
I created 2 additional pools each with a matching stretch rule:
- size=2/min=1 (not advised...
Greg Farnum
01:01 AM Bug #55549: OSDs crashing
After days of fighting this (it's on a production cluster) I finally gave up on the least important of the pools -- t... Richard Bateman

05/05/2022

04:23 PM Bug #47025: rados/test.sh: api_watch_notify_pp LibRadosWatchNotifyECPP.WatchNotify failed
This is from the 16.2.8 run.
/a/yuriw-2022-05-04_20:09:21-rados-pacific-distro-default-smithi/6821705...
Laura Flores
03:22 PM Backport #55439: quincy: FAILED ceph_assert due to issue manifest API to the original object
Backport Bot wrote:
> https://github.com/ceph/ceph/pull/46061
merged
Yuri Weinstein
03:20 PM Backport #55047: quincy: rados/test.sh hangs while running LibRadosTwoPoolsPP.ManifestFlushDupCount
Backport Bot wrote:
> https://github.com/ceph/ceph/pull/45624
merged
Yuri Weinstein
03:13 PM Bug #55559 (Duplicate): osd-backfill-stats.sh fails in TEST_backfill_ec_prim_out
/a/yuriw-2022-04-28_14:23:18-rados-wip-yuri-testing-2022-04-27-1456-quincy-distro-default-smithi/6811107... Laura Flores
01:37 PM Bug #53729: ceph-osd takes all memory before oom on boot
Can I kindly ask if there's an estimate when this will be fixed and backported? We have customers that have been in t... Ruben Kerkhof

05/04/2022

10:20 PM Bug #55549 (Resolved): OSDs crashing
My apologies if this is the wrong project; I'm so lost on this particular issue that I'm not even sure where to ask f... Richard Bateman
08:11 PM Bug #55407: quincy osd's fail to boot and crash
It seems I messed up everything... Let me start over.
I have a ceph cluster running since looooong time ago. Recent...
Gonzalo Aguilar Delgado
05:56 PM Bug #55407: quincy osd's fail to boot and crash
Gonzalo Aguilar Delgado wrote:
> It doesn't matter. This is just a side effect. I mean... The bug is not caused by t...
Neha Ojha
05:52 PM Bug #53789: CommandFailedError (rados/test_python.sh): "RADOS object not found" causes test_rados...
I think the lack of @-2@ (@ENOENT@) **might** be caused by the errno normalization @Objecter@ has. Radoslaw Zarzynski
07:42 AM Bug #53789: CommandFailedError (rados/test_python.sh): "RADOS object not found" causes test_rados...
I hit another issue when we have socket failure injection active when running the tests. I think this is not only the... Nitzan Mordechai
05:43 PM Bug #52624: qa: "Health check failed: Reduced data availability: 1 pg peering (PG_AVAILABILITY)"
To judge how severe the problem really is we need the information whether the stall is permanent (PG gets stuck and t... Radoslaw Zarzynski
01:35 PM Bug #52624: qa: "Health check failed: Reduced data availability: 1 pg peering (PG_AVAILABILITY)"
"These PG_AVAILABILITY warnings are frequently seen with snap-schedule teuthology jobs.":https://pulpito.ceph.com/mcha... Milind Changire
05:37 PM Bug #52657: MOSDPGLog::encode_payload(uint64_t): Assertion `HAVE_FEATURE(features, SERVER_NAUTILUS)'
Just for the record: we suspect the issue is related to the error injection in async-msgr. Some runs without them are sup... Radoslaw Zarzynski
11:30 AM Backport #55543 (Resolved): quincy: should use TCMalloc for better performance
https://github.com/ceph/ceph/pull/47927 Backport Bot
11:30 AM Backport #55542 (Rejected): octopus: should use TCMalloc for better performance
Backport Bot
11:30 AM Backport #55541 (Rejected): pacific: should use TCMalloc for better performance
https://github.com/ceph/ceph/pull/51282 Backport Bot
11:29 AM Bug #55519 (Pending Backport): should use TCMalloc for better performance
Kefu Chai
06:13 AM Documentation #46120 (Resolved): Improve ceph-objectstore-tool documentation
This issue has been resolved. The ceph-objectstore-tool documentation now exists, and there's even a good manpage.
...
Zac Dover

05/03/2022

07:48 PM Bug #54509: FAILED ceph_assert due to issue manifest API to the original object
Myoungwon Oh wrote:
> https://github.com/ceph/ceph/pull/46120
Thanks for looking into it and creating the backport!
Neha Ojha
02:18 AM Bug #54509: FAILED ceph_assert due to issue manifest API to the original object
https://github.com/ceph/ceph/pull/46120 Myoungwon Oh
01:40 AM Bug #54509: FAILED ceph_assert due to issue manifest API to the original object
I think this is the same issue as https://tracker.ceph.com/issues/50806.
This issue was already fixed, but not backp...
Myoungwon Oh
01:07 AM Bug #54509: FAILED ceph_assert due to issue manifest API to the original object
Sure. Myoungwon Oh
07:46 PM Backport #50893 (In Progress): pacific: osd/PrimaryLogPG.cc: FAILED ceph_assert(attrs || !recover...
Neha Ojha
06:26 PM Backport #55019: octopus: partial recovery: CEPH_OSD_OP_OMAPRMKEYRANGE should mark omap dirty
Christian Rohmann wrote:
> Sorry for being a nag ... I initially reported https://tracker.ceph.com/issues/53663 and ...
Vikhyat Umrao
04:41 PM Bug #55383: monitor cluster logs(ceph.log) appear empty until rotated
玮文 胡 wrote:
> Maybe we should fix the release note(https://docs.ceph.com/en/latest/releases/quincy/) first? The work...
Vikhyat Umrao
04:32 PM Bug #55383: monitor cluster logs(ceph.log) appear empty until rotated
玮文 胡 wrote:
> https://github.com/ceph/ceph/pull/46124
>
> Tested locally with
>
> [...]
Thank you.
Vikhyat Umrao
04:27 PM Bug #55383 (Fix Under Review): monitor cluster logs(ceph.log) appear empty until rotated
Vikhyat Umrao
09:55 AM Bug #55383: monitor cluster logs(ceph.log) appear empty until rotated
Maybe we should fix the release note(https://docs.ceph.com/en/latest/releases/quincy/) first? The workaround there is... 玮文 胡
09:44 AM Bug #55383: monitor cluster logs(ceph.log) appear empty until rotated
https://github.com/ceph/ceph/pull/46124
Tested locally with...
玮文 胡
03:49 PM Bug #51076: "wait_for_recovery: failed before timeout expired" during thrashosd test with EC back...
Ran 100 thrash-erasure-code-big tests in octopus, and the `wait_for_recovery` assertion occurred 18/100 times, with 1... Laura Flores
11:19 AM Bug #55407: quincy osd's fail to boot and crash
It doesn't matter. This is just a side effect. I mean... The bug is not caused by the tool.
The bug is caused becau...
Gonzalo Aguilar Delgado
07:23 AM Bug #55519 (Fix Under Review): should use TCMalloc for better performance
Kefu Chai
07:22 AM Bug #55519 (Resolved): should use TCMalloc for better performance
We had been using TCMalloc in older releases, but somehow we stopped doing so. Let's bring it back. Kefu Chai

05/02/2022

06:50 PM Bug #55383: monitor cluster logs(ceph.log) appear empty until rotated
To me looks like this is the problem?
https://github.com/ceph/ceph/commit/7c84e06e6f846f6b4b6fd959218b4d474520f429...
Vikhyat Umrao
05:29 PM Bug #54509: FAILED ceph_assert due to issue manifest API to the original object
Myoungwon Oh: Seeing this in pacific as well, can you confirm if it is the same issue?
/a/yuriw-2022-04-30_17:01:...
Neha Ojha
02:12 PM Bug #53789 (In Progress): CommandFailedError (rados/test_python.sh): "RADOS object not found" cau...
Nitzan Mordechai
01:11 PM Bug #52657: MOSDPGLog::encode_payload(uint64_t): Assertion `HAVE_FEATURE(features, SERVER_NAUTILUS)'
Scheduled another run with just the rados/verify test that failed and I can see this happen frequently:
/a/amathu...
Aishwarya Mathuria
01:07 PM Backport #55513 (In Progress): quincy: mount.ceph fails to understand AAAA records from SRV record
Matan Breizman
12:57 PM Backport #55513 (Resolved): quincy: mount.ceph fails to understand AAAA records from SRV record
https://github.com/ceph/ceph/pull/46113 Backport Bot
01:03 PM Backport #55514 (In Progress): pacific: mount.ceph fails to understand AAAA records from SRV record
Matan Breizman
12:57 PM Backport #55514 (Resolved): pacific: mount.ceph fails to understand AAAA records from SRV record
https://github.com/ceph/ceph/pull/46112 Backport Bot
12:52 PM Bug #47300 (Pending Backport): mount.ceph fails to understand AAAA records from SRV record
Matan Breizman
08:34 AM Bug #47300 (Resolved): mount.ceph fails to understand AAAA records from SRV record
Kefu Chai

05/01/2022

05:40 AM Bug #43887 (Fix Under Review): ceph_test_rados_delete_pools_parallel failure
https://github.com/ceph/ceph/pull/46099 Nitzan Mordechai

04/28/2022

09:29 PM Bug #55488 (New): ENOENT on clone on EC non-primary shard
... Neha Ojha
06:59 PM Bug #45721: CommandFailedError: Command failed (workunit test rados/test_python.sh) FAIL: test_ra...
/a/yuriw-2022-04-27_02:52:22-rados-pacific-distro-default-smithi/6807766... Laura Flores

04/27/2022

09:51 PM Backport #55439 (In Progress): quincy: FAILED ceph_assert due to issue manifest API to the origin...
Laura Flores
09:25 PM Bug #52657: MOSDPGLog::encode_payload(uint64_t): Assertion `HAVE_FEATURE(features, SERVER_NAUTILUS)'
Laura Flores wrote:
> This one looks somewhat different from the other reported failures. First of all, it failed on...
Laura Flores
05:37 PM Bug #52657: MOSDPGLog::encode_payload(uint64_t): Assertion `HAVE_FEATURE(features, SERVER_NAUTILUS)'
Let's discuss this on the next RADOS Team Meeting. Radoslaw Zarzynski
06:07 PM Bug #55424 (Won't Fix): ceph-mon process exit in dead status , which backtrace displayed has bloc...
Sorry, the version is EOL :-(. Radoslaw Zarzynski
06:06 PM Bug #55419 (Resolved): cephtool/test.sh: failure on blocklist testing
Neha Ojha
05:58 PM Bug #55440: osd-scrub-test.sh: TEST_scrub_test failed due to inconsistent PG
... Neha Ojha
05:56 PM Bug #49754: osd/OSD.cc: ceph_abort_msg("abort() called") during OSD::shutdown()
Laura Flores wrote:
> /a/yuriw-2022-04-26_00:11:14-rados-wip-55324-pacific-backport-distro-default-smithi/6805265/re...
Neha Ojha
05:49 PM Bug #55407: quincy osd's fail to boot and crash
Hello Gonzalo!
Just a quick note from a bug scrub: we don't support mixing the tool from a newer release with OSDs fr...
Radoslaw Zarzynski
05:39 PM Bug #51076: "wait_for_recovery: failed before timeout expired" during thrashosd test with EC back...
This was discussed in the rados meeting this week. Laura is trying to check if the bug exists in Octopus or not, to h... Neha Ojha
05:34 PM Bug #55433 (Closed): common: FAILED ceph_assert(((lock).is_locked()))
The fix will be merged with the original PR. Neha Ojha
10:24 AM Bug #47300 (Fix Under Review): mount.ceph fails to understand AAAA records from SRV record
Matan Breizman

04/26/2022

03:58 PM Bug #52657: MOSDPGLog::encode_payload(uint64_t): Assertion `HAVE_FEATURE(features, SERVER_NAUTILUS)'
This one looks somewhat different from the other reported failures. First of all, it failed on a rados/verify test, n... Laura Flores
02:33 PM Bug #49754: osd/OSD.cc: ceph_abort_msg("abort() called") during OSD::shutdown()
/a/yuriw-2022-04-26_00:11:14-rados-wip-55324-pacific-backport-distro-default-smithi/6805265/remote/smithi061/crash/20... Laura Flores
10:27 AM Bug #55450 (Resolved): [DOC] stretch_rule defined in the doc needs updation
in section [1], the stretch_rule defined to be added to the crush map needs to be updated.
min size and max size par...
Pawan Dhiran

04/25/2022

10:11 PM Bug #55433: common: FAILED ceph_assert(((lock).is_locked()))
https://github.com/ceph/ceph/pull/46028 has been merged to unblock other master PR merges. Neha Ojha
06:28 PM Bug #55433 (Fix Under Review): common: FAILED ceph_assert(((lock).is_locked()))
Neha Ojha
05:36 PM Bug #55433 (Closed): common: FAILED ceph_assert(((lock).is_locked()))
Seen in jenkins make check tests, i.e. https://jenkins.ceph.com/job/ceph-pull-requests/94227/console... Laura Flores
09:59 PM Bug #55440 (New): osd-scrub-test.sh: TEST_scrub_test failed due to inconsistent PG
/a/yuriw-2022-04-22_13:56:48-rados-wip-yuri2-testing-2022-04-22-0500-distro-default-smithi/6800338... Laura Flores
07:31 PM Bug #44595: cache tiering: Error: oid 48 copy_from 493 returned error code -2
/a/yuriw-2022-04-25_14:14:44-rados-wip-yuri3-testing-2022-04-22-0534-quincy-distro-default-smithi/6805186... Laura Flores
07:07 PM Bug #51076: "wait_for_recovery: failed before timeout expired" during thrashosd test with EC back...
/a/yuriw-2022-04-22_21:06:04-rados-wip-yuri3-testing-2022-04-22-0534-quincy-distro-default-smithi/6802072 Laura Flores
07:00 PM Backport #55439 (Resolved): quincy: FAILED ceph_assert due to issue manifest API to the original ...
https://github.com/ceph/ceph/pull/46061 Backport Bot
06:59 PM Bug #54509 (Pending Backport): FAILED ceph_assert due to issue manifest API to the original object
Laura Flores
06:58 PM Bug #54509: FAILED ceph_assert due to issue manifest API to the original object
/a/yuriw-2022-04-22_21:06:04-rados-wip-yuri3-testing-2022-04-22-0534-quincy-distro-default-smithi/6802065/remote/smit... Laura Flores
06:56 PM Bug #55435: mon/Elector: notify_ranked_removed() does not properly erase dead_ping in the case of...
In an example scenario where we have 5 monitors:
rank_size = 5
mon.a (rank 0)
mon.b (rank 1)
mon.c (rank 2)
mo...
Kamoltat (Junior) Sirivadhna
06:29 PM Bug #55435 (Resolved): mon/Elector: notify_ranked_removed() does not properly erase dead_ping in ...
Kamoltat (Junior) Sirivadhna
05:54 PM Bug #55407: quincy osd's fail to boot and crash
Gonzalo Aguilar Delgado wrote:
> Neha Ojha wrote:
> > Did you see the same segmentation fault in quincy and pacific...
Gonzalo Aguilar Delgado
05:50 PM Bug #55407: quincy osd's fail to boot and crash
The situation is even worse. Any osd created with ceph version 17.1.0 (c675060073a05d40ef404d5921c81178a52af6e0) quin... Gonzalo Aguilar Delgado
06:23 AM Bug #55407: quincy osd's fail to boot and crash
I managed to reproduce...
I install an OSD with the pacific version. Then I let it run for a while (10 min or so)...
Gonzalo Aguilar Delgado
05:51 PM Bug #55383: monitor cluster logs(ceph.log) appear empty until rotated
Looks like it is happening because of mon/LogMonitor changing it back to RADOS. Vikhyat Umrao
05:49 PM Bug #55383 (Triaged): monitor cluster logs(ceph.log) appear empty until rotated
Vikhyat Umrao
05:49 PM Bug #55383: monitor cluster logs(ceph.log) appear empty until rotated
If you are okay can you please send a quick fix? Vikhyat Umrao
05:48 PM Bug #55383: monitor cluster logs(ceph.log) appear empty until rotated
玮文 胡 wrote:
> I suspect this issue is due to https://github.com/ceph/ceph/commit/7c84e06e6f846f6b4b6fd959218b4d47452...
Vikhyat Umrao
05:19 PM Bug #55419 (Fix Under Review): cephtool/test.sh: failure on blocklist testing
Neha Ojha
03:23 PM Bug #54458: osd-scrub-snaps.sh: TEST_scrub_snaps failed due to malformed log message
Perhaps this has resurfaced?
/a/yuriw-2022-04-22_13:56:48-rados-wip-yuri2-testing-2022-04-22-0500-distro-default-s...
Laura Flores
09:42 AM Bug #55424 (Won't Fix): ceph-mon process exit in dead status , which backtrace displayed has bloc...
plz see abc.png
LevelDBstore::close
set thread quit flag, compact_queue_stop = true.
then send signal ...
Yong Wang

04/23/2022

09:59 AM Bug #55383: monitor cluster logs(ceph.log) appear empty until rotated
I suspect this issue is due to https://github.com/ceph/ceph/commit/7c84e06e6f846f6b4b6fd959218b4d474520f429 and have ... 玮文 胡
12:04 AM Bug #55419 (In Progress): cephtool/test.sh: failure on blocklist testing
Greg Farnum

04/22/2022

10:00 PM Bug #52153: crash: int MonitorDBStore::apply_transaction(MonitorDBStore::TransactionRef): abort
I have also seen this crash on my monitor running 16.2.7. Scott Hubbard
09:24 PM Bug #55407: quincy osd's fail to boot and crash
I saw the stacktrace. This time v17.2.0. Latest... Gonzalo Aguilar Delgado
09:20 PM Bug #55407: quincy osd's fail to boot and crash
Ok. This is the situation:
1.- OSD built from scratch in pacific. (docker pull ceph/daemon:latest-pacific)
2.- U...
Gonzalo Aguilar Delgado
08:47 PM Bug #55407: quincy osd's fail to boot and crash
Igor Fedotov wrote:
> >2022-04-22T13:34:42.419+0000 7fd5798ed080 -1 bluefs _replay 0x11000: stop: unrecognized op 12...
Gonzalo Aguilar Delgado
03:36 PM Bug #55407: quincy osd's fail to boot and crash
>2022-04-22T13:34:42.419+0000 7fd5798ed080 -1 bluefs _replay 0x11000: stop: unrecognized op 12
@Gonzalo, AFAIU you...
Igor Fedotov
01:38 PM Bug #55407: quincy osd's fail to boot and crash
Neha Ojha wrote:
> Did you see the same segmentation fault in quincy and pacific? Were you testing a custom build of...
Gonzalo Aguilar Delgado
09:21 PM Bug #55419 (Resolved): cephtool/test.sh: failure on blocklist testing
/a/yuriw-2022-04-22_13:56:48-rados-wip-yuri2-testing-2022-04-22-0500-distro-default-smithi/6800292... Laura Flores
07:52 PM Bug #24057 (Rejected): cbt fails to copy results to the archive dir
Neha Ojha
06:27 PM Bug #43189 (Resolved): pgs stuck in laggy state
Neha Ojha
06:27 PM Backport #43232 (Rejected): nautilus: pgs stuck in laggy state
Nautilus is EOL Neha Ojha
06:26 PM Bug #41385 (Resolved): osd/ReplicatedBackend.cc: 1349: FAILED ceph_assert(peer_missing.count(from...
Neha Ojha
06:26 PM Backport #41731 (Rejected): nautilus: osd/ReplicatedBackend.cc: 1349: FAILED ceph_assert(peer_mis...
Nautilus is EOL Neha Ojha
02:46 PM Backport #55405 (In Progress): quincy: librados C++ API requires C++17 to build
https://github.com/ceph/ceph/pull/46005 Radoslaw Zarzynski
02:41 PM Backport #55406 (In Progress): pacific: librados C++ API requires C++17 to build
https://github.com/ceph/ceph/pull/46004 Radoslaw Zarzynski

04/21/2022

11:11 PM Bug #55407 (Need More Info): quincy osd's fail to boot and crash
Did you see the same segmentation fault in quincy and pacific? Were you testing a custom build of ceph (17.1.0 is a d... Neha Ojha
08:20 PM Bug #55407 (Rejected): quincy osd's fail to boot and crash
I have a cluster with pacific. One of the osd started to crash...
So I zapped the disk and recreated again. I foun...
Gonzalo Aguilar Delgado
08:21 PM Bug #53729: ceph-osd takes all memory before oom on boot
Gonzalo Aguilar Delgado wrote:
> I suppose this thread can be closed as soon as the fix is in master. But just for r...
Gonzalo Aguilar Delgado
06:10 PM Bug #53729: ceph-osd takes all memory before oom on boot
I suppose this thread can be closed as soon as the fix is in master. But just for reference, in case has something to... Gonzalo Aguilar Delgado
05:01 PM Bug #53729: ceph-osd takes all memory before oom on boot
Mykola Golub wrote:
> Gonzalo Aguilar Delgado wrote:
>
> > Mykola, specially thank you for doing the patch.
>
...
Gonzalo Aguilar Delgado
04:39 PM Bug #53729: ceph-osd takes all memory before oom on boot
Gonzalo Aguilar Delgado wrote:
> Mykola, specially thank you for doing the patch.
I am not the author of the pa...
Mykola Golub
04:34 PM Bug #53729: ceph-osd takes all memory before oom on boot
Yesssss!!! Great job team!
It's up & running. It purged dups, booted the ceph-osd and only 1/2Gb RAM full booted. ...
Gonzalo Aguilar Delgado
04:23 PM Bug #53729: ceph-osd takes all memory before oom on boot
Mykola Golub wrote:
> Gonzalo Aguilar Delgado wrote:
>
> > CEPH_ARGS="--osd_pg_log_trim_max=10000 --osd_max_pg_lo...
Gonzalo Aguilar Delgado
07:30 AM Bug #53729: ceph-osd takes all memory before oom on boot
Gonzalo Aguilar Delgado wrote:
> CEPH_ARGS="--osd_pg_log_trim_max=10000 --osd_max_pg_log_entries=2000 " LD_LIBRARY...
Mykola Golub
07:01 AM Bug #53729: ceph-osd takes all memory before oom on boot
Nitzan Mordechai wrote:
> Gonzalo Aguilar Delgado wrote:
> > Nitzan Mordechai wrote:
> > > Can you please add the ...
Gonzalo Aguilar Delgado
07:28 PM Bug #55383: monitor cluster logs(ceph.log) appear empty until rotated
- Just for completeness: as expected, this issue happened again today in the gibba cluster during the log rotation window and... Vikhyat Umrao
05:14 PM Feature #54115 (In Progress): Log pglog entry size in OSD log if it exceeds certain size limit
Vikhyat Umrao
04:25 PM Backport #55406 (Rejected): pacific: librados C++ API requires C++17 to build
https://github.com/ceph/ceph/pull/46004 Backport Bot
04:25 PM Backport #55405 (In Progress): quincy: librados C++ API requires C++17 to build
Backport Bot
04:22 PM Bug #55233: librados C++ API requires C++17 to build
The c++ api was created only for internal use. It should not be held to such a guarantee. At least, that's what I u... Matt Benjamin
04:20 PM Bug #55233 (Pending Backport): librados C++ API requires C++17 to build
Radoslaw Zarzynski
03:31 PM Feature #53050 (Pending Backport): Support blocklisting a CIDR range
Greg Farnum
03:31 PM Feature #53050 (Resolved): Support blocklisting a CIDR range
Greg Farnum
12:21 PM Feature #55402 (New): rgw: Add dbstore & cloud-transition test-suites to teuthology
Add new test-suites to teuthology for below RGW features -
* cloud-transition
* dbstore backend
Soumya Koduri
03:40 AM Bug #55355: osd thread deadlock
Thanks for your reply @Radoslaw Zarzynski.
I checked the latest code and found that the code logic is the same. I th...
xueyong lu

04/20/2022

09:08 PM Bug #51076: "wait_for_recovery: failed before timeout expired" during thrashosd test with EC back...
Looking further into this issue, it looks like the bug occurs whenever "prep_object_replica_pushes()" is called, and ... Laura Flores
08:22 PM Bug #55383: monitor cluster logs(ceph.log) appear empty until rotated
Neha noticed today that the LRC cluster, even with this workaround still in place, when it went through a log r... Vikhyat Umrao
06:50 PM Bug #49231: MONs unresponsive over extended periods of time
Mimic is EOL :-(. Would you be able to upgrade soon? Radoslaw Zarzynski
06:43 PM Bug #55101 (New): mon has slow op
Radoslaw Zarzynski
06:33 PM Bug #55255 (Need More Info): "ceph iostat" exception!
Radoslaw Zarzynski
06:33 PM Bug #55255: "ceph iostat" exception!
Hello! How do you synchronize the clocks in your cluster? Is NTP properly running?
I'm asking about that to ensure...
Radoslaw Zarzynski
06:24 PM Bug #47025: rados/test.sh: api_watch_notify_pp LibRadosWatchNotifyECPP.WatchNotify failed
@Laura, @Nitzan: the assertion failure in the comment #7 is about @reply_map.size()@ while other occurrences mention ... Radoslaw Zarzynski
06:06 PM Bug #50608: ceph_assert(is_primary()) in PrimaryLogPG::on_local_recover
Hello! Looks like it's reproducible, which is good.
Would you be able to provide logs with extra debugs as mentioned in ht...
Radoslaw Zarzynski
06:02 PM Bug #55355: osd thread deadlock
Hello! @14.2.22@ is actually end-of-life. Would you be able to verify the issue on a newer version? Radoslaw Zarzynski
09:53 AM Bug #55355: osd thread deadlock
I found that thread 45 wants to stop the connection, but the connection has already been stopped by thread 71
@
(gdb) f 5
#5 Async...
xueyong lu
05:57 PM Bug #51463 (Resolved): blocked requests while stopping/starting OSDs
Radoslaw Zarzynski
05:55 PM Bug #52948: osd: fails to come up: "teuthology.misc:7 of 8 OSDs are up"
Lowering the priority as the last reproduction is 5 months old. Radoslaw Zarzynski
05:51 PM Bug #53924: EC PG stuckrecovery_unfound+undersized+degraded+remapped+peered
Good to know and thanks for your testing!
Just for the record: leaving the bug in the @Need More Info@ state as the...
Radoslaw Zarzynski
04:37 PM Bug #53729: ceph-osd takes all memory before oom on boot
Gonzalo Aguilar Delgado wrote:
> Nitzan Mordechai wrote:
> > Can you please add the output of trim-pg-log ?
> > CE...
Nitzan Mordechai
04:23 PM Bug #53729: ceph-osd takes all memory before oom on boot
Mykola Golub wrote:
> Just as information that might be useful for someone. Although ceph-objectstore-tool is a more...
Gonzalo Aguilar Delgado
04:21 PM Bug #53729: ceph-osd takes all memory before oom on boot
Nitzan Mordechai wrote:
> Can you please add the output of trim-pg-log ?
> CEPH_ARGS="--osd_pg_log_trim_max=10000 -...
Gonzalo Aguilar Delgado
02:47 PM Bug #53729: ceph-osd takes all memory before oom on boot
Just as information that might be useful for someone. Although ceph-objectstore-tool is a more reliable way to confir... Mykola Golub
12:03 PM Bug #53729: ceph-osd takes all memory before oom on boot
Can you please add the output of trim-pg-log ?
CEPH_ARGS="--osd_pg_log_trim_max=10000 --osd_max_pg_log_entries=2000 ...
Nitzan Mordechai
11:06 AM Bug #53729: ceph-osd takes all memory before oom on boot
After a while it crashed...
-34> 2022-04-20T11:02:25.218+0000 7f72b3b83640 5 rocksdb: commit_cache_size High Pri...
Gonzalo Aguilar Delgado
10:50 AM Bug #53729: ceph-osd takes all memory before oom on boot
I've built the repo from git@github.com:NitzanMordhai/ceph.git branch origin/wip-nitzan-pglog-dups-not-trimmed
And...
Gonzalo Aguilar Delgado

04/19/2022

11:45 PM Bug #55383: monitor cluster logs(ceph.log) appear empty until rotated
We found out that quincy has the https://github.com/ceph/ceph/pull/40640 log_to_journald feature. When we set ... Vikhyat Umrao
05:48 PM Bug #55383: monitor cluster logs(ceph.log) appear empty until rotated
Tim Wilkinson wrote:
> While executing tests on 17.1.0-203-g2c8d01fc, I see the ceph.log files on all MONs are zero ...
Vikhyat Umrao
04:56 PM Bug #55383: monitor cluster logs(ceph.log) appear empty until rotated
Gibba cluster quincy version `17.1.0-163-g4e244311`. Vikhyat Umrao
04:52 PM Bug #55383: monitor cluster logs(ceph.log) appear empty until rotated
I think we have not seen this issue in previous quincy builds? Something changed recently. Vikhyat Umrao
04:51 PM Bug #55383: monitor cluster logs(ceph.log) appear empty until rotated
Tim Wilkinson wrote:
> While executing tests on 17.1.0-203-g2c8d01fc, I see the ceph.log files on all MONs are zero ...
Vikhyat Umrao
04:43 PM Bug #55383 (Resolved): monitor cluster logs(ceph.log) appear empty until rotated
While executing tests on 17.1.0-203-g2c8d01fc, I see the ceph.log files on all MONs are zero length unless rotated.
...
Tim Wilkinson
08:13 AM Backport #55019: octopus: partial recovery: CEPH_OSD_OP_OMAPRMKEYRANGE should mark omap dirty
Sorry for being a nag ... I initially reported https://tracker.ceph.com/issues/53663 and still observe the issues of ... Christian Rohmann

04/18/2022

12:55 PM Bug #55355 (Resolved): osd thread deadlock
My Ceph version is 14.2.22.
After the network becomes abnormal, the OSD cannot join the cluster.
Then I find the osd th...
xueyong lu

04/16/2022

08:18 AM Bug #50608: ceph_assert(is_primary()) in PrimaryLogPG::on_local_recover
2022-04-16T00:06:06.526+0200 7f6997402700 -1 osd.110 1166753 heartbeat_check: no reply from <censor>:6812 osd.109 sin... Tobias Urdin
08:15 AM Bug #50608: ceph_assert(is_primary()) in PrimaryLogPG::on_local_recover
There are a lot of lines like this before the crash line in the log file:
-24> 2022-04-16T00:06:02.105+0200 7fede...
Tobias Urdin
08:14 AM Bug #50608: ceph_assert(is_primary()) in PrimaryLogPG::on_local_recover
2022-04-16T00:06:08.540+0200 7fedde264700 0 log_channel(cluster) log [WRN] : Monitor daemon marked osd.109 down, but... Tobias Urdin
08:09 AM Bug #50608: ceph_assert(is_primary()) in PrimaryLogPG::on_local_recover
OSD crashed with this again
{
"archived": "2022-04-15 23:07:21.580173",
"assert_condition": "is_primary(...
Tobias Urdin

04/15/2022

02:08 AM Bug #53924: EC PG stuckrecovery_unfound+undersized+degraded+remapped+peered
The patch
osd/PeeringState: fix acting_set_writeable min_size check
can resolve ceph v15.2.13 recovery_unfoun...
jianwei zhang
02:07 AM Bug #53924: EC PG stuckrecovery_unfound+undersized+degraded+remapped+peered
jianwei zhang wrote:
> Radoslaw Zarzynski wrote:
> > > the all osds is up&in, so the case doesn't involve recovery_...
jianwei zhang

04/14/2022

07:58 AM Bug #52948: osd: fails to come up: "teuthology.misc:7 of 8 OSDs are up"
Radoslaw Zarzynski wrote:
> Hello Sridhar! Is there anything new? Have we discussed it already maybe?
Hello Radek...
Sridhar Seshasayee
07:58 AM Bug #51463: blocked requests while stopping/starting OSDs
Yes. This is fixed with these two tasks:
https://tracker.ceph.com/issues/53327
https://tracker.ceph.com/issues/53326
Manuel Lausch
 
