Activity
From 12/13/2020 to 01/11/2021
01/11/2021
- 11:58 PM Bug #48789 (In Progress): qa/standalone/scrub/osd-scrub-snaps.sh: create_scenario: return 1
- 07:41 PM Bug #48789 (Triaged): qa/standalone/scrub/osd-scrub-snaps.sh: create_scenario: return 1
- related to https://github.com/ceph/ceph/pull/38651
- 10:54 PM Bug #48842: qa/standalone/osd/osd-recovery-prio.sh: TEST_recovery_pool_priority failed
This can easily be explained by a slow test machine. The 10 second sleep wasn't enough time to get recovery initia...
- 09:35 PM Bug #48842: qa/standalone/osd/osd-recovery-prio.sh: TEST_recovery_pool_priority failed
- ...
- 07:39 PM Bug #48842 (Resolved): qa/standalone/osd/osd-recovery-prio.sh: TEST_recovery_pool_priority failed
- ...
- 09:57 PM Bug #48843 (Resolved): Get more parallel scrubs within osd_max_scrubs limits
When a reservation failure prevents a PG from scrubbing, other possible scrubbable PGs aren't tried.
- 07:43 PM Bug #45441: rados: Health check failed: 1/3 mons down, quorum a,c (MON_DOWN)" in cluster log'
- rados/singleton-nomsgr/{all/recovery-unfound-found mon_election/connectivity rados supported-random-distro$/{centos_8...
- 07:32 PM Bug #48841 (Resolved): test_turn_off_module: wait_until_equal timed out
- ...
- 07:10 PM Bug #48840 (Closed): Octopus: Assert failure: test_ceph_osd_pool_create_utf8
- FAIL: test_rados.TestCommand.test_ceph_osd_pool_create_utf8
test_ceph_osd_pool_create_utf8 should also work if a c...
- 06:56 PM Bug #48745: Segmentation fault in PrimaryLogPG::cancel_manifest_ops
- Myoungwon Oh wrote:
> Hm... I can't find any clues in /a/nojha-2021-01-07_00\:06\:49-rados-master-distro-basic-smith...
- 06:09 PM Bug #38357: ClsLock.TestExclusiveEphemeralStealEphemeral failed
- /ceph/teuthology-archive/yuriw-2021-01-08_16:38:07-rados-wip-yuri4-testing-2021-01-07-1041-octopus-distro-basic-smith...
- 06:08 PM Bug #46323: thrash_cache_writeback_proxy_none: FAILED ceph_assert(version == old_value.version) i...
- /ceph/teuthology-archive/yuriw-2021-01-08_16:38:07-rados-wip-yuri4-testing-2021-01-07-1041-octopus-distro-basic-smit...
- 04:06 AM Bug #46323: thrash_cache_writeback_proxy_none: FAILED ceph_assert(version == old_value.version) i...
- http://pulpito.front.sepia.ceph.com:80/gregf-2021-01-09_02:02:11-rados-wip-stretch-updates-108-2-distro-basic-smithi/...
- 06:05 PM Bug #48793 (Triaged): out of order op
- details in https://tracker.ceph.com/issues/48777#note-7
- 10:16 AM Bug #48793: out of order op
- http://qa-proxy.ceph.com/teuthology/ideepika-2020-12-18_14:27:53-rados:thrash-erasure-code-master-distro-basic-smith...
- 06:05 PM Bug #48485: osd thrasher timeout
- Seems related; adding here, will verify later.
/ceph/teuthology-archive/yuriw-2021-01-08_16:38:07-rados-wip-yuri4-tes...
- 04:50 PM Backport #48482 (Resolved): nautilus: PG::_delete_some isn't optimal iterating objects
- 04:43 PM Backport #48482: nautilus: PG::_delete_some isn't optimal iterating objects
- Igor Fedotov wrote:
> https://github.com/ceph/ceph/pull/38478
merged - 11:51 AM Bug #44595: cache tiering: Error: oid 48 copy_from 493 returned error code -2
- ...
- 10:52 AM Bug #48821: osd crash in OSD::heartbeat when dereferencing null session
- The fix seems just to check that the session pointer is not null before trying to use it. If the problem is not deepe...
- 10:48 AM Bug #48821 (Resolved): osd crash in OSD::heartbeat when dereferencing null session
- For an unhealthy (unstable) cluster with flip-flopping osds we observed crashes like this:...
- 08:00 AM Bug #48212: pool last_epoch_clean floor is stuck after pg merging
- We got a little relief by reducing mon_osdmap_full_prune_min from the default 10,000 to 1,000 but osdmaps still grew ...
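As a hedged illustration of the mitigation described in the comment above (the option name comes from the comment; the runtime `ceph config` form assumes a Nautilus-or-later cluster, and `mon.a` is a placeholder daemon id):

```shell
# Lower the osdmap full-prune threshold from the default 10000 to 1000,
# as the reporter did, so the mons prune full osdmaps more aggressively.
ceph config set mon mon_osdmap_full_prune_min 1000

# Confirm the value the monitor is actually running with.
ceph config get mon.a mon_osdmap_full_prune_min
```

This requires a live cluster, so it is shown only as a sketch of the tuning, not a verified fix for the stuck last_epoch_clean floor.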
- 03:57 AM Bug #48503: scrub stat mismatch on bytes
- http://pulpito.front.sepia.ceph.com:80/gregf-2021-01-09_02:02:11-rados-wip-stretch-updates-108-2-distro-basic-smithi/...
- 03:40 AM Bug #47719: api_watch_notify: LibRadosWatchNotify.AioWatchDelete2 fails
- http://pulpito.front.sepia.ceph.com:80/gregf-2021-01-09_02:02:11-rados-wip-stretch-updates-108-2-distro-basic-smithi/...
01/09/2021
- 10:55 PM Bug #46978: OSD: shutdown of a OSD Host causes slow requests
- Hi Manuel,
Would you be able to test a patch for this issue?
If so, what OS and Ceph packages/version do you run?
...
- 04:47 PM Bug #48721: tcmalloc doesn't release memory
- Josh Durgin wrote:
> Those stats show the memory is mostly used by the mon or released by tcmalloc but the kernel ha...
- 04:16 PM Bug #48212: pool last_epoch_clean floor is stuck after pg merging
- Reproduced on another merge cycle. Restarting only the leading mon, waiting 5 minutes and then creating a new epoch r...
- 01:13 PM Bug #48212: pool last_epoch_clean floor is stuck after pg merging
- Apologies, the leading monitor bit is misleading. The osdmap data is immediately trimmed the moment the last monito...
- 12:56 PM Bug #48212: pool last_epoch_clean floor is stuck after pg merging
- We're running Ceph Octopus 15.2.8 with the same problem. Our monitors ran out of space after enabling autoscale as os...
- 06:53 AM Bug #48745: Segmentation fault in PrimaryLogPG::cancel_manifest_ops
- Hm... I can't find any clues in /a/nojha-2021-01-07_00\:06\:49-rados-master-distro-basic-smithi/5761073.
Can we repr...
01/08/2021
- 10:31 PM Bug #48721: tcmalloc doesn't release memory
- Those stats show the memory is mostly used by the mon or released by tcmalloc but the kernel hasn't reclaimed it.
...
- 10:30 PM Bug #48775 (Duplicate): FAILED ceph_assert(is_primary()) in PG::scrub()
- 10:27 PM Bug #48732 (Need More Info): Marking OSDs out causes mon daemons to crash following tcmalloc: lar...
- It would be great if you could share a reproducer for this, or reproduce it and capture monitor logs with debugging enabled.
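For reference, one common way to capture monitor logs with debugging enabled is to raise the mon debug levels before reproducing; a sketch using the standard `ceph config` interface (the levels shown are the usual ones requested for mon triage, and the log path assumes a default install):

```shell
# Raise monitor debug levels before attempting to reproduce the crash.
ceph config set mon debug_mon 20
ceph config set mon debug_ms 1
ceph config set mon debug_paxos 20

# ...reproduce, then collect /var/log/ceph/ceph-mon.<id>.log and
# restore the defaults afterwards:
ceph config set mon debug_mon 1/5
ceph config set mon debug_ms 0/5
ceph config set mon debug_paxos 1/5
```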
- 10:21 PM Bug #48790: rados/multimon: MON_DOWN in mon_election/connectivity with mon_clock_no_skews
- Greg, could you please take a look and see if my theory makes sense.
- 10:17 PM Bug #48745: Segmentation fault in PrimaryLogPG::cancel_manifest_ops
- Myoungwon Oh: I am assigning this to you for more inputs.
- 12:26 AM Bug #48745: Segmentation fault in PrimaryLogPG::cancel_manifest_ops
- Xie Xingguo/Myoungwon Oh: this seems to be new regression in master, do you know what could have caused it? I don't s...
- 12:11 AM Bug #48745: Segmentation fault in PrimaryLogPG::cancel_manifest_ops
- /a/nojha-2021-01-07_00:06:49-rados-master-distro-basic-smithi/5761073
- 07:07 PM Bug #48536 (Rejected): ceph tool: osd crush create-or-move cannot accept multiple crush buckets
- I'm dumb; this works just fine with the expected syntax:...
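The expected syntax referred to above passes the target CRUSH location as a list of type=name pairs after the weight; a hedged example (the osd id, weight, and bucket names are all hypothetical):

```shell
# Place osd.7 with weight 1.0 at a location spanning multiple CRUSH buckets.
# Each pair is bucket-type=bucket-name; all names here are made up.
ceph osd crush create-or-move osd.7 1.0 root=default rack=rack1 host=node1
```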
- 12:59 PM Bug #48789: qa/standalone/scrub/osd-scrub-snaps.sh: create_scenario: return 1
- I suspect this is related to the column family changes in RocksDB.
(Maybe https://github.com/ceph/ceph/pull/38310...
- 03:50 AM Bug #48789: qa/standalone/scrub/osd-scrub-snaps.sh: create_scenario: return 1
- /a//kchai-2021-01-07_03:01:15-rados-wip-kefu-testing-2021-01-05-2058-distro-basic-smithi/5761661
my branch include...
- 12:02 AM Bug #48793 (Resolved): out of order op
- ...
01/07/2021
- 09:06 PM Bug #48789: qa/standalone/scrub/osd-scrub-snaps.sh: create_scenario: return 1
- /a/nojha-2021-01-07_00:06:49-rados-master-distro-basic-smithi/5761090
- 06:17 PM Bug #48789 (Resolved): qa/standalone/scrub/osd-scrub-snaps.sh: create_scenario: return 1
- ...
- 08:58 PM Bug #48775: FAILED ceph_assert(is_primary()) in PG::scrub()
- @Ronen: /a/nojha-2021-01-07_00:06:49-rados-master-distro-basic-smithi/5760959 has logs
- 08:56 AM Bug #48775: FAILED ceph_assert(is_primary()) in PG::scrub()
- Seems to be the same problem supposedly solved by https://github.com/ceph/ceph/pull/38730.
Verifying.
- 12:25 AM Bug #48775: FAILED ceph_assert(is_primary()) in PG::scrub()
- /a/teuthology-2021-01-05_07:01:02-rados-master-distro-basic-smithi/5755459
- 12:12 AM Bug #48775 (Duplicate): FAILED ceph_assert(is_primary()) in PG::scrub()
- ...
- 08:55 PM Bug #48790: rados/multimon: MON_DOWN in mon_election/connectivity with mon_clock_no_skews
- My first impression is that this is related to election_strategy connectivity.
All mons are in quorum here:
...
- 06:56 PM Bug #48790 (New): rados/multimon: MON_DOWN in mon_election/connectivity with mon_clock_no_skews
- rados/multimon/{clusters/9 mon_election/connectivity msgr-failures/many msgr/async-v1only no_pools objectstore/bluest...
- 06:05 PM Bug #45647: "ceph --cluster ceph --log-early osd last-stat-seq osd.0" times out due to msgr-failu...
- rados/thrash-erasure-code-shec/{ceph clusters/{fixed-4 openstack} mon_election/classic msgr-failures/few objectstore/...
- 04:03 PM Bug #48786 (Resolved): api_tier_pp: LibRadosTwoPoolsPP.ManifestSnapRefcount/ManifestSnapRefcount2...
- Run: https://pulpito.ceph.com/teuthology-2021-01-07_05:00:03-smoke-master-distro-basic-smithi/
Job: 5761861
Logs: h...
- 12:20 PM Backport #48480 (Resolved): octopus: PG::_delete_some isn't optimal iterating objects
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/38477
m...
- 12:19 PM Backport #48243 (Resolved): octopus: collection_list_legacy: pg inconsistent
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/38098
m...
01/06/2021
- 04:27 PM Backport #48480: octopus: PG::_delete_some isn't optimal iterating objects
- Igor Fedotov wrote:
> https://github.com/ceph/ceph/pull/38477
merged
- 04:24 PM Backport #48243: octopus: collection_list_legacy: pg inconsistent
- Mykola Golub wrote:
> https://github.com/ceph/ceph/pull/38098
merged
- 10:20 AM Bug #48764 (New): Octopus: Radosmodel tests fails when trying DeleteOp::_begin()
- ...
- 09:17 AM Bug #45615: api_watch_notify_pp: LibRadosWatchNotifyPPTests/LibRadosWatchNotifyPP.WatchNotify/1 f...
- /ceph/teuthology-archive/yuriw-2021-01-04_18:28:05-rados-wip-yuri2-testing-2021-01-04-0837-octopus-distro-basic-smith...
- 06:40 AM Bug #47024: rados/test.sh: api_tier_pp LibRadosTwoPoolsPP.ManifestSnapRefcount failed
- /a//kchai-2021-01-06_02:57:51-rados-wip-kefu-testing-2021-01-05-2058-distro-basic-smithi/5758216
01/05/2021
- 03:27 PM Bug #48750 (Resolved): ceph config set using osd/host mask not working
- this does not work, tested with 14.2.9 and 14.2.16:...
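For context, the osd/host mask syntax being tested restricts an option to daemons matching a CRUSH location or device class; a hedged sketch of the documented form (the host name and values are placeholders, not from the report):

```shell
# Apply an option only to OSDs on a particular host (hypothetical host 'node1').
ceph config set osd/host:node1 osd_max_backfills 2

# Device-class masks use the same type:value form.
ceph config set osd/class:ssd osd_max_backfills 4
```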
01/04/2021
- 07:43 PM Bug #48732: Marking OSDs out causes mon daemons to crash following tcmalloc: large alloc
- This seems related to https://bugzilla.redhat.com/show_bug.cgi?id=1826450; our circumstances are highly similar.
- 05:42 PM Bug #48745 (Resolved): Segmentation fault in PrimaryLogPG::cancel_manifest_ops
- ...
12/31/2020
- 06:08 PM Bug #48732 (Need More Info): Marking OSDs out causes mon daemons to crash following tcmalloc: lar...
- On a 14.2.11 zero-load cluster I am taking some osd servers out of service.
I began marking OSDs out in preparation ...
- 11:31 AM Bug #46323: thrash_cache_writeback_proxy_none: FAILED ceph_assert(version == old_value.version) i...
- /a//kchai-2020-12-31_09:50:24-rados-wip-kefu-testing-2020-12-31-1427-distro-basic-smithi/5749187
12/30/2020
- 03:22 AM Bug #48323 (Resolved): "size 0 != clone_size 10" in clog - clone size mismatch when deduped objec...
12/29/2020
12/28/2020
- 11:54 AM Bug #48719 (Fix Under Review): ceph: qa/standalone/scrub/osd-recovery-scrub.sh: erroneous 'Recove...
- suggested PR: 38723
12/27/2020
- 08:52 PM Bug #48721 (New): tcmalloc doesn't release memory
- Ceph Monitor hasn't been releasing memory for many days! Using nautilus 14.2.14....
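When diagnosing this kind of report, the tcmalloc heap introspection built into the daemons is the usual first step; a sketch (the mon id `a` is a placeholder):

```shell
# Dump tcmalloc heap statistics for the monitor: shows bytes in use by the
# application vs. bytes held in tcmalloc's freelists but not returned to the OS.
ceph tell mon.a heap stats

# Ask tcmalloc to hand free pages back to the kernel.
ceph tell mon.a heap release
```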
- 11:03 AM Bug #48712 (In Progress): ceph_assert(is_primary()) in PG::scrub()
- Caused when a PGScrub message is queued by a primary, but only de-queued after an interval change.
(Specifica...
- 08:00 AM Bug #48719 (Triaged): ceph: qa/standalone/scrub/osd-recovery-scrub.sh: erroneous 'Recovery never ...
- Working on a fix that will search the log-files for the 'recovering' text,
instead of polling the 'pg stat'.
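A minimal sketch of that approach (file names and the state text are stand-ins; the real fix lives in the osd-recovery-scrub.sh helpers):

```shell
# Look for the 'recovering' state text in the test's OSD log files instead of
# polling 'ceph pg stat'. Paths here are stand-ins for the test's $dir.
dir=$(mktemp -d)
echo "pg 1.0 state active+recovering" > "$dir/osd.0.log"
if grep -q 'recovering' "$dir"/osd.*.log; then
    echo "recovery observed"
fi
rm -rf "$dir"
```

Searching the logs records that recovery happened at some point, whereas polling `pg stat` can miss a recovery that starts and finishes between samples.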
- 07:25 AM Bug #48719 (Fix Under Review): ceph: qa/standalone/scrub/osd-recovery-scrub.sh: erroneous 'Recove...
- The log shows:
@
osd-recovery-scrub.sh:182: TEST_recovery_scrub_2: echo 'Recovery never started'
@
Even though...
12/26/2020
- 01:25 PM Bug #48468: ceph-osd crash before being up again
- Tried with a systemd service instead of the docker container; same behaviour.
I've submitted the new crash-...
12/25/2020
- 06:12 AM Bug #21287: 1 PG down, OSD fails with "FAILED assert(i->prior_version == last || i->is_error())"
- Chang Liu wrote:
> see https://github.com/ceph/ceph/pull/16675
seems that patch was already applied since the "FA...
12/24/2020
- 02:40 PM Bug #48669: libec_isa.so with TEXTREL for ceph-v15.2.8 on aarch64
- duplicate of #48681
- 09:16 AM Bug #48468: ceph-osd crash before being up again
- Little update: I've tried with @v15.2.8@. Sadly, same behaviour.
- 06:10 AM Bug #48599 (Resolved): Segmentation fault in ~C_SetManifestRefCountDone()
- 05:17 AM Bug #47945 (Duplicate): scrubbing failure
- 04:09 AM Bug #47945: scrubbing failure
- /a//kchai-2020-12-23_05:37:18-rados-wip-kefu-testing-2020-12-23-1139-distro-basic-smithi/5732435/
- 04:18 AM Bug #48712: ceph_assert(is_primary()) in PG::scrub()
- Hi Ronen, do you mind taking a look?
- 04:18 AM Bug #48712 (Resolved): ceph_assert(is_primary()) in PG::scrub()
- ...
12/22/2020
- 08:34 PM Bug #48581: MON: global_init: error reading config file
- Oscar Segarra wrote:
> 2020-12-12 00:24:28 /opt/ceph-container/bin/entrypoint.sh: STAYALIVE: container will not die...
- 08:00 PM Documentation #23777 (Resolved): doc: description of OSD_OUT_OF_ORDER_FULL problem
- I concur with Anthony, and have accordingly changed the status of this issue to RESOLVED.
12/21/2020
- 10:15 AM Bug #48540: _txc_add_transaction error (17) File exists not handled on operation
- I have not seen this behavior anymore, so I would suggest archiving this issue for reference until it happens again ...
- 03:40 AM Bug #48600 (Resolved): osd: valgrind: Invalid read of size 8
12/18/2020
- 02:29 PM Bug #48613: Reproduce https://tracker.ceph.com/issues/48417
- investigating using: https://github.com/ideepika/ceph/pull/new/wip-tracker-48613
teuthology command: ...
- 09:58 AM Bug #47024: rados/test.sh: api_tier_pp LibRadosTwoPoolsPP.ManifestSnapRefcount failed
- ...
- 03:44 AM Bug #48669 (New): libec_isa.so with TEXTREL for ceph-v15.2.8 on aarch64
- Wrong library erasure-code/libec_isa.so after enabling isa-l EC for the aarch64 platform (commit 9091b7cc32fc0d031ab44dd264...
12/17/2020
- 11:31 PM Bug #45761: mon_thrasher: "Error ENXIO: mon unavailable" during sync_force command leads to "fail...
- Deepika Upadhyay wrote:
> master with 5 default: https://pulpito.ceph.com/ideepika-2020-12-16_07:01:58-rados:monthra...
- 07:13 AM Bug #48645 (New): Ceph-OSD octopus memory leak
- Hi everyone,
Our team operates a Ceph cluster (ceph version 15.2.5 (2c93eff00150f0cc5f106a559557a58d3d7b6f1f) octop...
- 06:45 AM Bug #45721: CommandFailedError: Command failed (workunit test rados/test_python.sh) FAIL: test_ra...
- ...
12/16/2020
- 02:09 PM Bug #45761: mon_thrasher: "Error ENXIO: mon unavailable" during sync_force command leads to "fail...
- master with 5 default: https://pulpito.ceph.com/ideepika-2020-12-16_07:01:58-rados:monthrash-master-distro-basic-smit...
12/15/2020
- 08:43 PM Bug #48599 (Fix Under Review): Segmentation fault in ~C_SetManifestRefCountDone()
- 01:37 AM Bug #48599: Segmentation fault in ~C_SetManifestRefCountDone()
- https://github.com/ceph/ceph/pull/38576
- 12:13 AM Bug #48599: Segmentation fault in ~C_SetManifestRefCountDone()
- Ok. I’ll take a look.
- 07:03 PM Bug #48613 (Resolved): Reproduce https://tracker.ceph.com/issues/48417
- Use the osd thrashing tests to reproduce https://tracker.ceph.com/issues/48417.
1. only applies to EC
2. aim is t...
- 06:58 PM Bug #48611 (Resolved): osd: Delay sending info to new backfill peer resetting last_backfill until...
- This should be relatively harmless as any osd with lb=MIN wouldn’t have been sent any real IOs anyway. Might be analo...
- 06:54 PM Bug #48609 (Closed): osd/PGLog: don’t fast-forward can_rollback_to during merge_log if the log is...
- (a) Doing so is wrong specifically for intervals where we go peered but not active.
See PGLog.cc:456, we uncondition...
12/14/2020
- 10:19 PM Bug #48042 (Resolved): Log "ceph health detail" periodically in cluster log
- While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ...
- 10:11 PM Bug #48599: Segmentation fault in ~C_SetManifestRefCountDone()
- Probably related to https://github.com/ceph/ceph/pull/37546/commits/29d442b4c7e0be1fd9f765049d00c65a978fa373.
@Myoun...
- 09:03 PM Bug #48599 (Resolved): Segmentation fault in ~C_SetManifestRefCountDone()
- ...
- 09:14 PM Bug #48600 (Fix Under Review): osd: valgrind: Invalid read of size 8
- 09:10 PM Bug #48600 (Resolved): osd: valgrind: Invalid read of size 8
- ...
- 08:47 PM Bug #47767 (Resolved): octopus: setting noscrub crashed osd process
- 06:15 PM Backport #48596 (Resolved): octopus: nautilus: qa/standalone/scrub/osd-scrub-test.sh: _scrub_abor...
- https://github.com/ceph/ceph/pull/40278
- 06:15 PM Backport #48595 (Resolved): nautilus: nautilus: qa/standalone/scrub/osd-scrub-test.sh: _scrub_abo...
- https://github.com/ceph/ceph/pull/39125
- 06:14 PM Bug #48566 (Pending Backport): nautilus: qa/standalone/scrub/osd-scrub-test.sh: _scrub_abort: re...
This is probably fixed by https://github.com/ceph/ceph/pull/38472, which doesn't have a tracker, so this will be that...
- 03:25 PM Feature #48590 (Rejected): Add ability to blocklist a cephx entity name, a set of entities by a l...
- Background:
The need for fencing in a kubernetes multicluster scenario is presented here: https://lists.ceph.io/hy... - 12:16 PM Backport #48228 (Resolved): octopus: Log "ceph health detail" periodically in cluster log
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/38345
m...
- 03:12 AM Bug #48583 (In Progress): nautilus: Log files are created with rights root:root
- 01:28 AM Bug #48583 (Resolved): nautilus: Log files are created with rights root:root
- On 7148b7c3ae254fbc7796ae63d86ea681c68e0d88...
12/13/2020
- 01:12 PM Bug #48581 (New): MON: global_init: error reading config file
- 2020-12-12 00:24:28 /opt/ceph-container/bin/entrypoint.sh: STAYALIVE: container will not die if a command fails.,
2...