Activity
From 05/28/2021 to 06/26/2021
06/26/2021
- 02:27 PM Backport #50986: pacific: unaligned access to member variables of crush_work_bucket
- Backport Bot wrote:
> https://github.com/ceph/ceph/pull/41983
merged
- 02:26 PM Backport #50989: pacific: mon: slow ops due to osd_failure
- Backport Bot wrote:
> https://github.com/ceph/ceph/pull/41982
merged
06/25/2021
- 09:43 PM Bug #51101: rados/test_envlibrados_for_rocksdb.sh: cmake: symbol lookup error: cmake: undefined s...
- /a/yuriw-2021-06-24_16:54:31-rados-wip-yuri-testing-2021-06-24-0708-pacific-distro-basic-smithi/6190738
- 03:50 PM Backport #51371 (Resolved): pacific: OSD crash FAILED ceph_assert(!is_scrubbing())
- https://github.com/ceph/ceph/pull/41944
- 03:48 PM Bug #50346 (Pending Backport): OSD crash FAILED ceph_assert(!is_scrubbing())
- 06:48 AM Backport #50990 (Resolved): octopus: mon: slow ops due to osd_failure
06/24/2021
06/23/2021
- 10:43 PM Bug #50346: OSD crash FAILED ceph_assert(!is_scrubbing())
- Andrej Filipcic wrote:
> A related crash, happened when I disabled scrubbing:
>
> -1> 2021-06-14T11:17:15.373...
- 10:42 PM Bug #51338 (Duplicate): osd/scrub_machine.cc: FAILED ceph_assert(state_cast<const NotActive*>...
- Originally reported in https://tracker.ceph.com/issues/50346#note-6...
- 06:16 PM Backport #50796: octopus: mon: spawn loop after mon reinstalled
- Backport Bot wrote:
> https://github.com/ceph/ceph/pull/41621
merged
- 03:31 PM Backport #50797: pacific: mon: spawn loop after mon reinstalled
- Backport Bot wrote:
> https://github.com/ceph/ceph/pull/41768
merged
- 03:30 PM Bug #49988: Global Recovery Event never completes
- https://github.com/ceph/ceph/pull/41872 merged
06/22/2021
- 10:47 PM Backport #50986 (In Progress): pacific: unaligned access to member variables of crush_work_bucket
- 10:44 PM Backport #50989 (In Progress): pacific: mon: slow ops due to osd_failure
- 05:21 PM Backport #50705: octopus: _delete_some additional unexpected onode list
- Backport Bot wrote:
> https://github.com/ceph/ceph/pull/41623
merged
- 05:21 PM Backport #50152: octopus: Reproduce https://tracker.ceph.com/issues/48417
- Dan van der Ster wrote:
> Nathan I've done the manual backport here: https://github.com/ceph/ceph/pull/41609
> Copy...
- 11:23 AM Backport #51315 (In Progress): nautilus: osd:scrub skip some pg
- 10:55 AM Backport #51315 (Resolved): nautilus: osd:scrub skip some pg
- https://github.com/ceph/ceph/pull/41973
- 11:11 AM Backport #51314 (In Progress): octopus: osd:scrub skip some pg
- 10:55 AM Backport #51314 (Resolved): octopus: osd:scrub skip some pg
- https://github.com/ceph/ceph/pull/41972
- 10:56 AM Backport #51313 (In Progress): pacific: osd:scrub skip some pg
- 10:55 AM Backport #51313 (Resolved): pacific: osd:scrub skip some pg
- https://github.com/ceph/ceph/pull/41971
- 10:55 AM Backport #51316 (Duplicate): nautilus: osd:scrub skip some pg
- 10:53 AM Bug #49487 (Pending Backport): osd:scrub skip some pg
- 10:28 AM Bug #50346 (Fix Under Review): OSD crash FAILED ceph_assert(!is_scrubbing())
- 06:59 AM Bug #50346: OSD crash FAILED ceph_assert(!is_scrubbing())
- Andrej Filipcic wrote:
> On a 60-node, 1500 HDD cluster, and 16.2.4 release, this issue became very frequent, especi...
06/21/2021
- 08:50 PM Bug #50659: Segmentation fault under Pacific 16.2.1 when using a custom crush location hook
- FYI I tried with ceph/daemon-base:master-24e1f91-pacific-centos-8-x86_64 (the latest non-devel build at this time) ju...
- 06:07 PM Bug #51307 (Resolved): LibRadosWatchNotify.Watch2Delete fails
- ...
- 03:40 PM Bug #51270: mon: stretch mode clusters do not sanely set default crush rules
- Accidentally requested backports to Octopus/Nautilus, so nuking those.
- 03:40 PM Backport #51289 (Rejected): octopus: mon: stretch mode clusters do not sanely set default crush r...
- Accidental backport request
- 03:40 PM Backport #51288 (Rejected): nautilus: mon: stretch mode clusters do not sanely set default crush ...
- Accidental backport request
06/20/2021
06/19/2021
- 03:00 PM Backport #51290 (Resolved): pacific: mon: stretch mode clusters do not sanely set default crush r...
- https://github.com/ceph/ceph/pull/42909
- 03:00 PM Backport #51289 (Rejected): octopus: mon: stretch mode clusters do not sanely set default crush r...
- 03:00 PM Backport #51288 (Rejected): nautilus: mon: stretch mode clusters do not sanely set default crush ...
- 02:58 PM Bug #51270 (Pending Backport): mon: stretch mode clusters do not sanely set default crush rules
- 01:15 PM Backport #51287 (Resolved): pacific: LibRadosService.StatusFormat failed, Expected: (0) != (retry...
- https://github.com/ceph/ceph/pull/46677
- 01:12 PM Bug #51234 (Pending Backport): LibRadosService.StatusFormat failed, Expected: (0) != (retry), act...
- 01:11 PM Bug #51234 (Resolved): LibRadosService.StatusFormat failed, Expected: (0) != (retry), actual: 0 vs 0
- 02:19 AM Bug #51076: "wait_for_recovery: failed before timeout expired" during thrashosd test with EC back...
- I think this one is related?
/ceph/teuthology-archive/pdonnell-2021-06-16_21:26:55-fs-wip-pdonnell-testing-2021061...
06/18/2021
- 09:15 PM Bug #51083: Raw space filling up faster than used space
- I don't have any ideas from the logs. Moving this back to RADOS. I doubt it has anything to do with CephFS.
- 11:56 AM Bug #51083: Raw space filling up faster than used space
- Patrick Donnelly wrote:
> Scrub is unlikely to help.
I came to the same conclusion after reading the documentatio...
06/17/2021
- 11:07 PM Bug #51083: Raw space filling up faster than used space
- Jan-Philipp Litza wrote:
> Yesterday evening we finally managed to upgrade the MDS daemons as well, and that seems t...
- 09:13 PM Bug #51083: Raw space filling up faster than used space
- Patrick: do you understand how upgrading the MDS daemons helped in this case? There is nothing in the osd/bluestore s...
- 09:03 PM Bug #51254: deep-scrub stat mismatch on last PG in pool
- We definitely do not use cache tiering on any of our clusters. On the cluster above, we do use snapshots (via cephfs...
- 08:48 PM Bug #51254: deep-scrub stat mismatch on last PG in pool
- It seems like you are using cache tiering, and there have been similar bugs reported like this. I don't understand why...
- 09:01 PM Bug #51234 (Fix Under Review): LibRadosService.StatusFormat failed, Expected: (0) != (retry), act...
- 08:57 PM Bug #50842 (Need More Info): pacific: recovery does not complete because of rw_manager lock not ...
- 08:53 PM Backport #51269 (In Progress): octopus: rados/perf: cosbench workloads hang forever
- 07:14 PM Backport #51269 (Resolved): octopus: rados/perf: cosbench workloads hang forever
- https://github.com/ceph/ceph/pull/41922
- 08:42 PM Bug #51074 (Pending Backport): standalone/osd-rep-recov-eio.sh: TEST_rep_read_unfound failed with...
- marking Pending Backport, needs to be included with https://github.com/ceph/ceph/pull/41731
- 08:40 PM Bug #51168 (Need More Info): ceph-osd state machine crash during peering process
- Can you please attach the osd log for this crash?
- 08:07 PM Bug #51270 (Fix Under Review): mon: stretch mode clusters do not sanely set default crush rules
- 08:03 PM Bug #51270 (Pending Backport): mon: stretch mode clusters do not sanely set default crush rules
- If you do not specify a crush rule when creating a pool, the OSDMonitor picks the default one for you out of the conf...
- 07:12 PM Bug #49139 (Pending Backport): rados/perf: cosbench workloads hang forever
- 02:32 PM Backport #51237: nautilus: rebuild-mondb hangs
- Backport Bot wrote:
> https://github.com/ceph/ceph/pull/41874
merged
06/16/2021
- 10:40 PM Bug #51254 (New): deep-scrub stat mismatch on last PG in pool
- In the past few weeks, we got inconsistent PGs in deep-scrub a few times, always on the very last PG in the pool:
...
- 07:25 PM Bug #38357: ClsLock.TestExclusiveEphemeralStealEphemeral failed
- ...
- 07:22 PM Bug #45423: api_tier_pp: [ FAILED ] LibRadosTwoPoolsPP.HitSetWrite
- /ceph/teuthology-archive/yuriw-2021-06-14_19:20:57-rados-wip-yuri6-testing-2021-06-14-1106-octopus-distro-basic-smith...
- 06:48 PM Bug #50042: rados/test.sh: api_watch_notify failures
- ...
- 02:49 PM Bug #51246 (New): error in open_pools_parallel: rados_write(0.obj) failed with error: -2
- ...
- 01:22 PM Bug #50587: mon election storm following osd recreation: huge tcmalloc and ceph::msgr::v2::FrameA...
- > Will this patch be released in 14.2.22?
yes the PR has been merged to the nautilus branch, so it will be in the ...
- 12:59 PM Bug #50587: mon election storm following osd recreation: huge tcmalloc and ceph::msgr::v2::FrameA...
- We hit this bug yesterday in a nautilus 14.2.18 cluster.
All monitors went down and started crashing on restart.
...
- 02:18 AM Backport #51237 (In Progress): nautilus: rebuild-mondb hangs
- 02:16 AM Backport #51237 (Resolved): nautilus: rebuild-mondb hangs
- https://github.com/ceph/ceph/pull/41874
- 02:13 AM Bug #38219 (Pending Backport): rebuild-mondb hangs
06/15/2021
- 08:05 PM Bug #50462: OSDs crash in osd/osd_types.cc: FAILED ceph_assert(clone_overlap.count(clone))
- Just to note:
IMO ceph-bluestore-tool crash is caused by a bug in AvlAllocator and is a duplicate of https://tracker...
- 06:59 PM Backport #51215 (In Progress): pacific: Global Recovery Event never completes
- 12:55 AM Backport #51215 (Resolved): pacific: Global Recovery Event never completes
- Backport PR https://github.com/ceph/ceph/pull/41872
- 06:50 PM Backport #50706: pacific: _delete_some additional unexpected onode list
- Backport Bot wrote:
> https://github.com/ceph/ceph/pull/41680
merged
- 06:47 PM Bug #50842: pacific: recovery does not complete because of rw_manager lock not being released
- @Neha: I did, but am afraid they are lost, the test was from https://pulpito.ceph.com/ideepika-2021-05-17_10:16:28-ra...
- 06:31 PM Bug #50842: pacific: recovery does not complete because of rw_manager lock not being released
- @Deepika, do you happen to have the logs saved somewhere?
- 06:41 PM Bug #51234 (Pending Backport): LibRadosService.StatusFormat failed, Expected: (0) != (retry), act...
- ...
- 06:30 PM Bug #51076: "wait_for_recovery: failed before timeout expired" during thrashosd test with EC back...
- rados/thrash-erasure-code-big/{ceph cluster/{12-osds openstack} mon_election/connectivity msgr-failures/osd-dispatch-...
- 06:26 PM Bug #51076: "wait_for_recovery: failed before timeout expired" during thrashosd test with EC back...
- Looks very similar...
- 04:12 PM Backport #50750: octopus: max_misplaced was replaced by target_max_misplaced_ratio
- Backport Bot wrote:
> https://github.com/ceph/ceph/pull/41624
merged
- 12:20 PM Bug #51223 (New): statfs: a cluster with filestore and bluestore OSD's will report bytes_used == ...
- Cluster migrated from Luminous mixed bluestore+filestore OSD's to Nautilus 14.2.21
After last filestore OSD purged f...
- 10:47 AM Bug #49677 (Resolved): debian ceph-common package post-inst clobbers ownership of cephadm log dirs
- While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ...
- 10:47 AM Bug #49781 (Resolved): unittest_mempool.check_shard_select failed
- While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ...
- 10:44 AM Bug #50501 (Resolved): osd/scheduler/mClockScheduler: Async reservers are not updated with the ov...
- While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ...
- 10:44 AM Bug #50558 (Resolved): Data loss propagation after backfill
- While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ...
- 10:42 AM Backport #50795 (Resolved): nautilus: mon: spawn loop after mon reinstalled
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/41762
m...
- 10:41 AM Backport #50704 (Resolved): nautilus: _delete_some additional unexpected onode list
- 10:36 AM Backport #50153 (Resolved): nautilus: Reproduce https://tracker.ceph.com/issues/48417
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/41611
m...
- 10:36 AM Backport #49729 (Resolved): nautilus: debian ceph-common package post-inst clobbers ownership of ...
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/40698
m...
- 10:32 AM Backport #50988: nautilus: mon: slow ops due to osd_failure
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/41519
m...
- 09:05 AM Backport #50406: pacific: mon: new monitors may direct MMonJoin to a peon instead of the leader
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/41131
m...
- 09:04 AM Backport #50344: pacific: mon: stretch state is inconsistently-maintained on peons, preventing pr...
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/41130
m...
- 09:04 AM Backport #50794 (Resolved): pacific: osd: FAILED ceph_assert(recovering.count(*i)) after non-prim...
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/41320
m...
- 09:03 AM Backport #50702 (Resolved): pacific: Data loss propagation after backfill
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/41236
m...
- 09:03 AM Backport #50606 (Resolved): pacific: osd/scheduler/mClockScheduler: Async reservers are not updat...
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/41125
m...
- 09:02 AM Backport #49992 (Resolved): pacific: unittest_mempool.check_shard_select failed
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/40566
m...
- 12:54 AM Bug #49988 (Pending Backport): Global Recovery Event never completes
06/14/2021
- 03:24 PM Feature #51213 (Resolved): [ceph osd set noautoscale] Global on/off flag for PG autoscale feature
- For now, we do not have a global flag, like `ceph osd set noout` for the pg autoscale feature. We have pool flags[1] ...
- 09:24 AM Bug #50346: OSD crash FAILED ceph_assert(!is_scrubbing())
A related crash, happened when I disabled scrubbing:
-1> 2021-06-14T11:17:15.373+0200 7fb9916f5700 -1 /home/...
06/13/2021
- 03:34 PM Bug #51194: PG recovery_unfound after scrub repair failed on primary
- To prevent the user IO from being blocked, we took this action:
1. First, we queried the unfound objects. osd.951 ...
- 01:36 PM Bug #51194 (New): PG recovery_unfound after scrub repair failed on primary
- This comes from a mail I sent to the ceph-users ML: https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/3...
- 03:30 PM Backport #51195 (Resolved): pacific: [rfe] increase osd_max_write_op_reply_len default value to 6...
- https://github.com/ceph/ceph/pull/53470
- 03:28 PM Bug #51166 (Pending Backport): [rfe] increase osd_max_write_op_reply_len default value to 64 bytes
06/12/2021
06/10/2021
- 08:35 PM Backport #51173 (Rejected): nautilus: regression in ceph daemonperf command output, osd columns a...
- 08:35 PM Backport #51172 (Resolved): pacific: regression in ceph daemonperf command output, osd columns ar...
- https://github.com/ceph/ceph/pull/44175
- 08:35 PM Backport #51171 (Resolved): octopus: regression in ceph daemonperf command output, osd columns ar...
- https://github.com/ceph/ceph/pull/44176
- 08:32 PM Bug #51002 (Pending Backport): regression in ceph daemonperf command output, osd columns aren't v...
- 06:48 PM Backport #50795: nautilus: mon: spawn loop after mon reinstalled
- Dan van der Ster wrote:
> https://github.com/ceph/ceph/pull/41762
merged
- 03:58 PM Bug #51168 (New): ceph-osd state machine crash during peering process
- ...
- 02:22 PM Bug #51166 (Fix Under Review): [rfe] increase osd_max_write_op_reply_len default value to 64 bytes
- 02:16 PM Bug #51166 (Resolved): [rfe] increase osd_max_write_op_reply_len default value to 64 bytes
- As agreed in #ceph-devel, Sage, Josh, Neha concurring.
- 01:53 PM Bug #51076: "wait_for_recovery: failed before timeout expired" during thrashosd test with EC back...
- For the dead jobs, relevant logs have been uploaded to senta02 under /home/sseshasa/recovery_timeout.
Please let me ...
- 10:25 AM Bug #50346: OSD crash FAILED ceph_assert(!is_scrubbing())
- On a 60-node, 1500 HDD cluster, and 16.2.4 release, this issue became very frequent, especially when RBD writes excee...
06/09/2021
- 01:32 PM Bug #51076: "wait_for_recovery: failed before timeout expired" during thrashosd test with EC back...
- From the logs of 6161181 snapshot recovery is not able to proceed since a rwlock on the head version
(3:cb63772d:::...
- 08:07 AM Bug #51076: "wait_for_recovery: failed before timeout expired" during thrashosd test with EC back...
- Ran the same test repeatedly (5 times) on master by setting osd_op_queue to 'wpq' and 'mclock_scheduler' on different...
- 10:39 AM Bug #51074: standalone/osd-rep-recov-eio.sh: TEST_rep_read_unfound failed with "Bad data after pr...
- Kefu, yes I did read the update and your effort to find the commit(s) that caused the regression in the standalone te...
- 10:25 AM Bug #51074: standalone/osd-rep-recov-eio.sh: TEST_rep_read_unfound failed with "Bad data after pr...
- Sridhar, please read the https://tracker.ceph.com/issues/51074#note-3. that's my finding in the last 3 days.
- 09:44 AM Bug #51074: standalone/osd-rep-recov-eio.sh: TEST_rep_read_unfound failed with "Bad data after pr...
- Raised PR https://github.com/ceph/ceph/pull/41782 to address the test failure.
Please see latest update to https:/...
- 09:05 AM Backport #51151 (Rejected): nautilus: When read failed, ret can not take as data len, in FillInVe...
- 09:05 AM Backport #51150 (Resolved): pacific: When read failed, ret can not take as data len, in FillInVer...
- https://github.com/ceph/ceph/pull/44173
- 09:05 AM Backport #51149 (Resolved): octopus: When read failed, ret can not take as data len, in FillInVer...
- https://github.com/ceph/ceph/pull/44174
- 09:02 AM Bug #51115 (Pending Backport): When read failed, ret can not take as data len, in FillInVerifyExtent
06/08/2021
- 11:10 PM Bug #38219: rebuild-mondb hangs
- http://qa-proxy.ceph.com/teuthology/yuriw-2021-06-08_20:53:36-rados-wip-yuri-testing-2021-06-04-0753-nautilus-distro-...
- 06:39 PM Bug #38219: rebuild-mondb hangs
- 2021-06-04T23:05:38.775 INFO:tasks.ceph.mon.a.smithi071.stderr:/build/ceph-14.2.21-305-gac8fcfa6/src/mon/OSDMonitor.c...
- 07:52 PM Backport #50797 (In Progress): pacific: mon: spawn loop after mon reinstalled
- 05:58 PM Backport #50795: nautilus: mon: spawn loop after mon reinstalled
- https://github.com/ceph/ceph/pull/41762
- 04:50 PM Bug #50681: memstore: apparent memory leak when removing objects
- Sven Anderson wrote:
> Greg Farnum wrote:
> > How long did you wait to see if memory usage dropped? Did you look at...
- 08:48 AM Bug #51074 (Triaged): standalone/osd-rep-recov-eio.sh: TEST_rep_read_unfound failed with "Bad dat...
06/07/2021
- 11:37 AM Backport #51117 (In Progress): pacific: osd: Run osd bench test to override default max osd capac...
- 10:25 AM Backport #51117 (Resolved): pacific: osd: Run osd bench test to override default max osd capacity...
- https://github.com/ceph/ceph/pull/41731
- 10:22 AM Fix #51116 (Resolved): osd: Run osd bench test to override default max osd capacity for mclock.
- 08:28 AM Bug #51115: When read failed, ret can not take as data len, in FillInVerifyExtent
- https://github.com/ceph/ceph/pull/41727
- 08:12 AM Bug #51115 (Fix Under Review): When read failed, ret can not take as data len, in FillInVerifyExtent
- 07:42 AM Bug #51115 (Resolved): When read failed, ret can not take as data len, in FillInVerifyExtent
- When a read fails (for example, returning -EIO), FillInVerifyExtent takes ret as the data length.
- 06:59 AM Bug #51083: Raw space filling up faster than used space
- Yesterday evening we finally managed to upgrade the MDS daemons as well, and that seems to have stopped the space was...
06/06/2021
- 11:22 AM Feature #51110 (New): invalidate crc in buffer::ptr::c_str()
- h3. what:
*buffer::ptr* (or more precisely, *buffer::raw*) has the ability to cache CRC codes that are calculated ...
06/05/2021
- 04:14 PM Bug #51074: standalone/osd-rep-recov-eio.sh: TEST_rep_read_unfound failed with "Bad data after pr...
- not able to reproduce this issue locally. bisecting:
|0331281e8a74d0b744cdcede1db24e7fea4656fc | https://pulpito.c...
- 04:07 PM Bug #51074: standalone/osd-rep-recov-eio.sh: TEST_rep_read_unfound failed with "Bad data after pr...
- /a/kchai-2021-06-05_13:57:48-rados-master-distro-basic-smithi/6154221/
- 05:05 AM Bug #50441 (Pending Backport): cephadm bootstrap on arm64 fails to start ceph/ceph-grafana service
06/04/2021
- 11:01 PM Bug #51030 (Fix Under Review): osd crush during writing to EC pool when enabling jaeger tracing
- 10:44 PM Bug #50943 (Closed): mon crash due to assert failed
- Luminous is EOL, can you please redeploy the monitor and upgrade to a supported version of Ceph. Please reopen this t...
- 09:56 PM Bug #50308 (Resolved): mon: stretch state is inconsistently-maintained on peons, preventing prope...
- 09:55 PM Backport #50344 (Resolved): pacific: mon: stretch state is inconsistently-maintained on peons, pr...
- 09:55 PM Bug #50345 (Resolved): mon: new monitors may direct MMonJoin to a peon instead of the leader
- 09:54 PM Backport #50406 (Resolved): pacific: mon: new monitors may direct MMonJoin to a peon instead of t...
- https://github.com/ceph/ceph/pull/41131
- 09:41 PM Bug #36304: FAILED ceph_assert(p != pg_slots.end()) in OSDShard::register_and_wake_split_child(PG*)
- https://pulpito.ceph.com/gregf-2021-06-03_20:03:04-rados-pacific-mmonjoin-leader-testing-distro-basic-smithi/6150351/
- 08:56 PM Bug #50853: libcephsqlite: Core dump while running test_libcephsqlite.sh.
- Also a little more information about what the test is doing: this stage is testing that libcephsqlite kills all I/O i...
- 08:41 PM Bug #50853 (Need More Info): libcephsqlite: Core dump while running test_libcephsqlite.sh.
- So, unfortunately I've been unable to get the correct debugging symbols for the core file so I haven't been able to g...
- 07:13 PM Bug #51101 (Resolved): rados/test_envlibrados_for_rocksdb.sh: cmake: symbol lookup error: cmake: ...
- ...
- 06:27 PM Bug #39150: mon: "FAILED ceph_assert(session_map.sessions.empty())" when out of quorum
- /a/yuriw-2021-06-02_18:33:05-rados-wip-yuri3-testing-2021-06-02-0826-pacific-distro-basic-smithi/6147408
- 06:25 PM Bug #48997: rados/singleton/all/recovery-preemption: defer backfill|defer recovery not found in logs
- /a/yuriw-2021-06-02_18:33:05-rados-wip-yuri3-testing-2021-06-02-0826-pacific-distro-basic-smithi/6147404
- 06:24 PM Bug #49689: osd/PeeringState.cc: ceph_abort_msg("past_interval start interval mismatch") start
- /a/yuriw-2021-06-02_18:33:05-rados-wip-yuri3-testing-2021-06-02-0826-pacific-distro-basic-smithi/6147462 - with logs!
- 04:08 PM Bug #47440: nautilus: valgrind caught leak in Messenger::ms_deliver_verify_authorizer
- ...
- 06:46 AM Bug #50775: mds and osd unable to obtain rotating service keys
- Ilya Dryomov wrote:
> Yes, "debug paxos = 30" would definitely help! Sorry, I missed it because the previous set of...
- 02:08 AM Bug #50813 (Duplicate): mon/OSDMonitor: should clear new flag when do destroy
- The issue was fixed by https://github.com/ceph/ceph/commit/13393f6108a89973e0415caa61c6025c760a3930
06/03/2021
- 09:49 PM Bug #50775: mds and osd unable to obtain rotating service keys
- Yes, "debug paxos = 30" would definitely help! Sorry, I missed it because the previous set of logs that you shared h...
- 06:12 AM Bug #50775: mds and osd unable to obtain rotating service keys
- Ilya Dryomov wrote:
> These logs are still weird. Now there is plenty of update_from_paxos log messages but virtual...
- 07:48 PM Bug #51083 (Need More Info): Raw space filling up faster than used space
- We're seeing something strange currently. Our cluster is filling up faster than it should, and I assume it has someth...
- 07:48 PM Backport #50153: nautilus: Reproduce https://tracker.ceph.com/issues/48417
- Dan van der Ster wrote:
> Nautilus still has the buggy code in PG.cc (it was factored out to PeeringState.cc in octo...
- 07:28 PM Backport #49729: nautilus: debian ceph-common package post-inst clobbers ownership of cephadm log...
- Backport Bot wrote:
> https://github.com/ceph/ceph/pull/40698
merged
- 05:52 PM Backport #50704 (In Progress): nautilus: _delete_some additional unexpected onode list
- 04:11 PM Backport #50706 (In Progress): pacific: _delete_some additional unexpected onode list
- 10:34 AM Bug #51076 (Resolved): "wait_for_recovery: failed before timeout expired" during thrashosd test w...
- /a/sseshasa-2021-06-01_08:27:04-rados-wip-sseshasa-testing-objs-test-2-distro-basic-smithi/6145021
Unfortunately t...
- 09:17 AM Bug #46847: Loss of placement information on OSD reboot
- I had a look at the reproducer and am not entirely sure if it is equivalent to the problem discussed here. it might b...
- 09:07 AM Bug #51074 (Resolved): standalone/osd-rep-recov-eio.sh: TEST_rep_read_unfound failed with "Bad da...
- Observed on Master:
/a/sseshasa-2021-06-01_08:27:04-rados-wip-sseshasa-testing-objs-test-2-distro-basic-smithi/61450...
- 12:54 AM Bug #47654 (Resolved): test_mon_pg: mon fails to join quorum to due election strategy mismatch
- 12:54 AM Backport #50087 (Resolved): pacific: test_mon_pg: mon fails to join quorum to due election strate...
06/02/2021
- 08:40 PM Backport #50794: pacific: osd: FAILED ceph_assert(recovering.count(*i)) after non-primary osd res...
- Backport Bot wrote:
> https://github.com/ceph/ceph/pull/41320
merged
- 06:54 PM Backport #50702: pacific: Data loss propagation after backfill
- Backport Bot wrote:
> https://github.com/ceph/ceph/pull/41236
merged
- 06:53 PM Backport #50606: pacific: osd/scheduler/mClockScheduler: Async reservers are not updated with the...
- Sridhar Seshasayee wrote:
> https://github.com/ceph/ceph/pull/41125
merged
- 06:51 PM Backport #49992: pacific: unittest_mempool.check_shard_select failed
- Backport Bot wrote:
> https://github.com/ceph/ceph/pull/40566
merged
- 06:46 PM Bug #45721: CommandFailedError: Command failed (workunit test rados/test_python.sh) FAIL: test_ra...
- /a/yuriw-2021-05-25_19:21:19-rados-wip-yuri2-testing-2021-05-25-0940-pacific-distro-basic-smithi/6134490
- 06:38 PM Bug #50042: rados/test.sh: api_watch_notify failures
- /a/yuriw-2021-05-25_19:21:19-rados-wip-yuri2-testing-2021-05-25-0940-pacific-distro-basic-smithi/6134471
- 11:46 AM Bug #50903 (Closed): ceph_objectstore_tool: Slow ops reported during the test.
- Closing this since the issue was hit during teuthology testing of my PR: https://github.com/ceph/ceph/pull/41308. Thi...
- 06:57 AM Bug #50806: osd/PrimaryLogPG.cc: FAILED ceph_assert(attrs || !recovery_state.get_pg_log().get_mis...
- Observed on master:
/a/sseshasa-2021-06-01_08:27:04-rados-wip-sseshasa-testing-objs-test-2-distro-basic-smithi/61450...
- 06:54 AM Bug #50192: FAILED ceph_assert(attrs || !recovery_state.get_pg_log().get_missing().is_missing(soi...
- Observed on master:
/a/sseshasa-2021-06-01_08:27:04-rados-wip-sseshasa-testing-objs-test-2-distro-basic-smithi/61450...
- 04:14 AM Bug #49962 (Resolved): 'sudo ceph --cluster ceph osd crush tunables default' fails due to valgrin...
- Thanks Radoslaw!
- 03:47 AM Bug #50853 (In Progress): libcephsqlite: Core dump while running test_libcephsqlite.sh.
06/01/2021
- 05:37 PM Bug #50853: libcephsqlite: Core dump while running test_libcephsqlite.sh.
- /a/sage-2021-05-29_16:04:00-rados-wip-sage3-testing-2021-05-29-1009-distro-basic-smithi/6142109
- 05:24 PM Bug #50743: *: crash in pthread_getname_np
- ...
- 11:51 AM Backport #50750 (In Progress): octopus: max_misplaced was replaced by target_max_misplaced_ratio
- 11:50 AM Backport #50705 (In Progress): octopus: _delete_some additional unexpected onode list
- 11:49 AM Backport #50987 (In Progress): octopus: unaligned access to member variables of crush_work_bucket
- 11:48 AM Backport #50796 (In Progress): octopus: mon: spawn loop after mon reinstalled
- 11:47 AM Backport #50790 (In Progress): octopus: osd: write_trunc omitted to clear data digest
- 11:41 AM Backport #50990 (In Progress): octopus: mon: slow ops due to osd_failure
- 09:43 AM Bug #51024: OSD - FAILED ceph_assert(clone_size.count(clone), keeps on restarting after one host ...
- > I set the cluster into "maintenance mode", noout, norebalance, nobackfill, norecover. And then proceeded to reboot ...
- 09:24 AM Bug #51024: OSD - FAILED ceph_assert(clone_size.count(clone), keeps on restarting after one host ...
- Could be related to https://github.com/ceph/ceph/pull/40572
- 07:57 AM Bug #51024: OSD - FAILED ceph_assert(clone_size.count(clone), keeps on restarting after one host ...
- https://tracker.ceph.com/issues/48060 is the same
- 09:19 AM Backport #50153: nautilus: Reproduce https://tracker.ceph.com/issues/48417
- Nautilus still has the buggy code in PG.cc (it was factored out to PeeringState.cc in octopus and newer).
I backpo...
- 08:54 AM Backport #50152 (In Progress): octopus: Reproduce https://tracker.ceph.com/issues/48417
- Nathan I've done the manual backport here: https://github.com/ceph/ceph/pull/41609
Copy it to something with backpor...
- 07:58 AM Bug #48060: data loss in EC pool
- I have had exactly the same issue with my cluster - https://tracker.ceph.com/issues/51024 while not even having any d...
- 06:03 AM Bug #51030: osd crush during writing to EC pool when enabling jaeger tracing
- PR: https://github.com/ceph/ceph/pull/41604
- 05:48 AM Bug #51030 (Fix Under Review): osd crush during writing to EC pool when enabling jaeger tracing
- On cent8(x86_64)
1. compiled with -DWITH_JAEGER=ON
2. starts vstart cluster
3. write to ec pool (i.e. rados benc...
05/31/2021
- 02:46 PM Bug #50903: ceph_objectstore_tool: Slow ops reported during the test.
- This issue is related to the changes currently under review: https://github.com/ceph/ceph/pull/41308
The ceph_obje...
- 02:01 PM Bug #50688 (Duplicate): Ceph can't be deployed using cephadm on nodes with /32 ip addresses
- 02:01 PM Bug #50688: Ceph can't be deployed using cephadm on nodes with /32 ip addresses
- should have been fixed by https://github.com/ceph/ceph/pull/40961
- 07:56 AM Bug #51024: OSD - FAILED ceph_assert(clone_size.count(clone), keeps on restarting after one host ...
- I forgot to add.
I pulled v15.2.12 on the affected host, and also try running the OSD in that version. It didn't m...
- 07:55 AM Bug #51024 (New): OSD - FAILED ceph_assert(clone_size.count(clone), keeps on restarting after one...
- Good day
I'm currently experiencing the same issue as with this gentleman: https://www.mail-archive.com/ceph-users...
05/29/2021
- 08:04 AM Bug #50775: mds and osd unable to obtain rotating service keys
- These logs are still weird. Now there is plenty of update_from_paxos log messages but virtually no paxosservice log ...
- 02:25 AM Bug #45423: api_tier_pp: [ FAILED ] LibRadosTwoPoolsPP.HitSetWrite
- /a/kchai-2021-05-28_13:33:45-rados-wip-kefu-testing-2021-05-28-1806-distro-basic-smithi/6140866