Activity
From 12/20/2021 to 01/18/2022
01/18/2022
- 09:20 PM Bug #53924: EC PG stuck recovery_unfound+undersized+degraded+remapped+peered
- Ceph OSD 33 Logs with grep unfound!
- 09:14 PM Bug #53924: EC PG stuck recovery_unfound+undersized+degraded+remapped+peered
- Ceph PG query!
- 09:11 PM Bug #53924 (Need More Info): EC PG stuck recovery_unfound+undersized+degraded+remapped+peered
- ...
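For anyone triaging a PG stuck with unfound objects like this, a minimal inspection sequence looks roughly like the following (the pgid 2.4 is a placeholder):
  ceph health detail            # lists the PGs reporting unfound objects
  ceph pg 2.4 list_unfound      # shows which objects are unfound
  ceph pg 2.4 query             # peering/recovery detail, including which OSDs were probed
  # Last resort, only once every candidate OSD is back up (gives up the unfound writes):
  ceph pg 2.4 mark_unfound_lost revert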
- 08:36 PM Bug #53923 (Resolved): [Upgrade] mgr FAILED to decode MSG_PGSTATS
- ...
- 05:42 PM Bug #51076: "wait_for_recovery: failed before timeout expired" during thrashosd test with EC back...
- /a/yuriw-2022-01-15_05:47:18-rados-wip-yuri8-testing-2022-01-14-1551-distro-default-smithi/6619577
/a/yuriw-2022-01-...
- 04:23 PM Bug #45721: CommandFailedError: Command failed (workunit test rados/test_python.sh) FAIL: test_ra...
- /a/yuriw-2022-01-14_23:22:09-rados-wip-yuri6-testing-2022-01-14-1207-distro-default-smithi/6617813
- 08:26 AM Bug #53910 (Closed): client: client session state stuck in opening and hang all the time
01/16/2022
- 08:40 PM Bug #53729: ceph-osd takes all memory before oom on boot
- Do you need something else to find a workaround or the full solution?
Is there anything I can do?
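One way to narrow down where the memory goes on a tcmalloc build (a suggestion, not something from the ticket; it only works while the daemon is still up, and osd.33 is a placeholder id):
  ceph tell osd.33 heap stats            # current heap usage as seen by tcmalloc
  ceph tell osd.33 heap start_profiler   # start heap profiling
  ceph tell osd.33 heap dump             # write a heap profile for offline analysis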
01/14/2022
- 11:21 PM Bug #53895 (Resolved): Unable to format `ceph config dump` command output in yaml using `-f yaml`
- https://bugzilla.redhat.com/show_bug.cgi?id=2040709
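The report boils down to the YAML formatter producing unusable output while the other formatters behave, e.g.:
  ceph config dump -f yaml          # broken formatting per this tracker
  ceph config dump -f json-pretty   # works as expected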
- 10:45 AM Bug #43266 (Resolved): common: admin socket compiler warning
- While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ...
- 07:56 AM Bug #43887: ceph_test_rados_delete_pools_parallel failure
- /a/yuriw-2022-01-13_18:06:52-rados-wip-yuri3-testing-2022-01-13-0809-distro-default-smithi/6614510
- 12:26 AM Backport #53877 (In Progress): octopus: pgs wait for read lease after osd start
- 12:12 AM Backport #53876 (In Progress): pacific: pgs wait for read lease after osd start
01/13/2022
- 11:15 PM Backport #53877 (Resolved): octopus: pgs wait for read lease after osd start
- https://github.com/ceph/ceph/pull/44585
- 11:15 PM Backport #53876 (Resolved): pacific: pgs wait for read lease after osd start
- https://github.com/ceph/ceph/pull/44584
- 11:11 PM Bug #53326 (Pending Backport): pgs wait for read lease after osd start
- 10:54 PM Bug #53729: ceph-osd takes all memory before oom on boot
- Neha Ojha wrote:
> Gonzalo Aguilar Delgado wrote:
> > Neha Ojha wrote:
> > > Like the other case reported in the m...
- 10:52 PM Bug #53729: ceph-osd takes all memory before oom on boot
- Igor Fedotov wrote:
> One more case:
> https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/FQXV452YLHBJ...
- 12:23 PM Bug #53729: ceph-osd takes all memory before oom on boot
- One more case:
https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/FQXV452YLHBJW6Y2UK7WUZP7HO5PVIA5/
- 10:13 PM Bug #53767: qa/workunits/cls/test_cls_2pc_queue.sh: killing an osd during thrashing causes timeout
- Same failed test, and same Traceback message as reported above. Pasted here is another relevant part of the log that ...
- 09:06 PM Bug #51076: "wait_for_recovery: failed before timeout expired" during thrashosd test with EC back...
- /a/yuriw-2022-01-11_19:17:55-rados-wip-yuri5-testing-2022-01-11-0843-distro-default-smithi/6608450
- 09:01 PM Bug #53875 (Duplicate): AssertionError: wait_for_recovery: failed before timeout expired due to d...
- Description: rados/thrash-erasure-code-big/{ceph cluster/{12-osds openstack} mon_election/connectivity msgr-failures/...
- 08:57 PM Bug #51904: test_pool_min_size:AssertionError:wait_for_clean:failed before timeout expired due to...
- /a/yuriw-2022-01-12_21:37:22-rados-wip-yuri6-testing-2022-01-12-1131-distro-default-smithi/6611439
last pg map bef...
01/12/2022
- 11:04 PM Bug #53729: ceph-osd takes all memory before oom on boot
- Gonzalo Aguilar Delgado wrote:
> Neha Ojha wrote:
> > Like the other case reported in the mailing list ([ceph-users...
- 09:50 PM Bug #53729: ceph-osd takes all memory before oom on boot
- Gonzalo Aguilar Delgado wrote:
> Hi,
>
> The logs I've already provided had:
> --debug_osd 90 --debug_mon 2 --d...
- 08:40 PM Bug #53729: ceph-osd takes all memory before oom on boot
- Neha Ojha wrote:
> Like the other case reported in the mailing list ([ceph-users] OSDs use 200GB RAM and crash) and ...
- 08:38 PM Bug #53729: ceph-osd takes all memory before oom on boot
- Neha Ojha wrote:
> Like the other case reported in the mailing list ([ceph-users] OSDs use 200GB RAM and crash) and ...
- 08:37 PM Bug #53729: ceph-osd takes all memory before oom on boot
- Hi,
The logs I've already provided had:
--debug_osd 90 --debug_mon 2 --debug_filestore 7 --debug_monc 99 --debug...
- 06:24 PM Bug #53729: ceph-osd takes all memory before oom on boot
- Like the other case reported in the mailing list ([ceph-users] OSDs use 200GB RAM and crash) and https://tracker.ceph...
- 07:12 PM Bug #53855 (Resolved): rados/test.sh hangs while running LibRadosTwoPoolsPP.ManifestFlushDupCount
- Description: rados/basic/{ceph clusters/{fixed-2 openstack} mon_election/connectivity msgr-failures/many msgr/async-v...
- 07:08 PM Bug #53294: rados/test.sh hangs while running LibRadosTwoPoolsPP.TierFlushDuringFlush
- Later on in the example Neha originally posted (/a/yuriw-2021-11-15_19:24:05-rados-wip-yuri8-testing-2021-11-15-0845-...
- 06:55 PM Support #51609: OSD refuses to start (OOMK) due to pg split
- Tor Martin Ølberg wrote:
> Tor Martin Ølberg wrote:
> > After an upgrade to 15.2.13 from 15.2.4 my small home lab c...
- 06:19 PM Bug #52124: Invalid read of size 8 in handle_recovery_delete()
- /a/yuriw-2022-01-11_19:17:55-rados-wip-yuri5-testing-2022-01-11-0843-distro-default-smithi/6608445/
- 10:03 AM Bug #50659: Segmentation fault under Pacific 16.2.1 when using a custom crush location hook
- This is present in 16.2.7. Any reason why the linked PR wasn't merged into that release?
01/11/2022
- 08:47 PM Backport #53719 (In Progress): octopus: mon: frequent cpu_tp had timed out messages
- 08:33 PM Backport #53718 (In Progress): pacific: mon: frequent cpu_tp had timed out messages
- 08:31 PM Backport #53507 (Duplicate): pacific: ceph -s mon quorum age negative number
- Backport was handled in PR: https://github.com/ceph/ceph/pull/43698
- 08:29 PM Backport #53660 (In Progress): octopus: mon: "FAILED ceph_assert(session_map.sessions.empty())" w...
- 08:29 PM Backport #53659 (In Progress): pacific: mon: "FAILED ceph_assert(session_map.sessions.empty())" w...
- 08:27 PM Backport #53721 (Resolved): octopus: common: admin socket compiler warning
- The relevant code has already made it into Octopus, no further backport required.
- 08:27 PM Backport #53720 (Resolved): pacific: common: admin socket compiler warning
- The relevant code has already made it to Pacific, no further backport necessary.
- 08:14 PM Backport #53769 (In Progress): pacific: [ceph osd set noautoscale] Global on/off flag for PG auto...
- 08:14 PM Backport #53769: pacific: [ceph osd set noautoscale] Global on/off flag for PG autoscale feature
- https://github.com/ceph/ceph/pull/44540
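For reference, the flag this backport delivers is driven like this (assuming a build containing the backport):
  ceph osd set noautoscale         # turn PG autoscaling off cluster-wide
  ceph osd pool autoscale-status   # every pool should now report autoscaling off
  ceph osd unset noautoscale       # turn it back on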
- 01:55 PM Bug #53824 (Fix Under Review): Stretch mode: peering can livelock with acting set changes swappin...
- 12:14 AM Bug #53824: Stretch mode: peering can livelock with acting set changes swapping primary back and ...
- So, why is it accepting the non-acting-set member each time, when they seem to have the same data? There's a clue in ...
- 12:14 AM Bug #53824 (Pending Backport): Stretch mode: peering can livelock with acting set changes swappin...
- From https://bugzilla.redhat.com/show_bug.cgi?id=2025800
We're getting repeated swaps in the acting set, with logg...
- 06:42 AM Bug #52319: LibRadosWatchNotify.WatchNotify2 fails
- /a/yuriw-2022-01-06_15:57:04-rados-wip-yuri6-testing-2022-01-05-1255-distro-default-smithi/6599471...
- 05:29 AM Bug #45721: CommandFailedError: Command failed (workunit test rados/test_python.sh) FAIL: test_ra...
- /a/yuriw-2022-01-06_15:57:04-rados-wip-yuri6-testing-2022-01-05-1255-distro-default-smithi/6599449...
01/10/2022
- 10:20 PM Bug #53729: ceph-osd takes all memory before oom on boot
- Forget about the previous comment.
The stack trace is just the opposite: it seems the call to encode in PGLog::_writ...
- 10:02 PM Bug #53729: ceph-osd takes all memory before oom on boot
- I was taking a look at:
3,1 GiB: OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*) (in /usr/bin/ce...
- 09:37 PM Bug #53729: ceph-osd takes all memory before oom on boot
- I did something better. I added a new OSD with bluestore to see whether it's a problem with the filestore backend.
Then ...
- 09:43 AM Bug #52124: Invalid read of size 8 in handle_recovery_delete()
- /a/yuriw-2022-01-08_17:57:43-rados-wip-yuri8-testing-2022-01-07-1541-distro-default-smithi/6603232
- 02:22 AM Bug #53740: mon: all mon daemon always crash after rm pool
- Neha Ojha wrote:
> Do you happen to have a coredump from this crash?
No
01/07/2022
- 10:25 PM Bug #53789: CommandFailedError (rados/test_python.sh): "RADOS object not found" causes test_rados...
- /a/yuriw-2022-01-06_15:50:38-rados-wip-yuri8-testing-2022-01-05-1411-distro-default-smithi/6598917
- 10:09 PM Bug #48468: ceph-osd crash before being up again
- Igor Fedotov wrote:
> @neha, @Gonsalo - to avoid the mess let's use https://tracker.ceph.com/issues/53729 for furthe...
- 09:56 PM Bug #48468: ceph-osd crash before being up again
- @neha, @Gonsalo - to avoid the mess let's use https://tracker.ceph.com/issues/53729 for further communication on the ...
- 06:22 PM Bug #48468: ceph-osd crash before being up again
- Gonzalo Aguilar Delgado wrote:
> Hi I'm having the same problem.
>
> -7> 2021-12-25T12:05:37.491+0100 7fd15c9...
- 09:48 PM Bug #53729: ceph-osd takes all memory before oom on boot
- Looks relevant as well:
https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/YHR3P7N5EXCKNHK45L7FRF4XNBOC...
- 09:20 PM Bug #50192: FAILED ceph_assert(attrs || !recovery_state.get_pg_log().get_missing().is_missing(soi...
- /a/yuriw-2022-01-06_15:50:38-rados-wip-yuri8-testing-2022-01-05-1411-distro-default-smithi/6599338
- 06:41 PM Bug #53806 (Resolved): unnecessarily long laggy PG state
- the first `pg_lease_ack_t` after becoming laggy would not trigger `recheck_readable`. However, every other ack would ...
- 06:34 PM Bug #53740: mon: all mon daemon always crash after rm pool
- Do you happen to have a coredump from this crash?
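On a systemd host, a quick way to check whether a usable coredump was captured (a general recipe, not specific to this cluster):
  coredumpctl list ceph-mon                    # cores collected by systemd, if any
  coredumpctl dump ceph-mon -o /tmp/mon.core   # extract the newest one
  ceph crash ls                                # crash reports gathered by the mgr crash module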
01/06/2022
- 09:57 PM Bug #53789 (Pending Backport): CommandFailedError (rados/test_python.sh): "RADOS object not found...
- Description: rados/basic/{ceph clusters/{fixed-2 openstack} mon_election/connectivity msgr-failures/many msgr/async-v...
01/05/2022
- 09:56 PM Bug #53294: rados/test.sh hangs while running LibRadosTwoPoolsPP.TierFlushDuringFlush
- /a/yuriw-2022-01-04_21:52:15-rados-wip-yuri7-testing-2022-01-04-1159-distro-default-smithi/6595525
- 09:51 PM Bug #38357: ClsLock.TestExclusiveEphemeralStealEphemeral failed
- /a/yuriw-2022-01-04_21:52:15-rados-wip-yuri7-testing-2022-01-04-1159-distro-default-smithi/6595522
01/04/2022
- 11:31 PM Backport #53769 (Resolved): pacific: [ceph osd set noautoscale] Global on/off flag for PG autosca...
- 11:26 PM Feature #51213 (Pending Backport): [ceph osd set noautoscale] Global on/off flag for PG autoscale...
- 09:49 PM Bug #53768 (New): timed out waiting for admin_socket to appear after osd.2 restart in thrasher/de...
- Error snippet:
2022-01-02T01:37:09.296 DEBUG:teuthology.orchestra.run.smithi086:> sudo adjust-ulimits ceph-coverag...
- 09:35 PM Bug #53767 (Duplicate): qa/workunits/cls/test_cls_2pc_queue.sh: killing an osd during thrashing c...
- Description: rados/verify/{centos_latest ceph clusters/{fixed-2 openstack} d-thrash/default/{default thrashosds-healt...
- 06:14 PM Bug #51945: qa/workunits/mon/caps.sh: Error: Expected return 13, got 0
- /a/yuriw-2021-12-23_16:50:03-rados-wip-yuri6-testing-2021-12-22-1410-distro-default-smithi/6582413...
- 12:37 PM Bug #23827: osd sends op_reply out of order
- Here is the log information from my environment.
h3. op1 arrives and sends shards to peer OSDs
2021-12-01 18:27:...
- 03:11 AM Bug #53757: I have a rados object whose data size is 0, and this object has a large amount of oma...
- PR: https://github.com/ceph/ceph/pull/44450
- 02:59 AM Bug #53757 (Fix Under Review): I have a rados object whose data size is 0, and this object has a ...
- Env: Ceph version is 10.2.9, OS is RHEL 7.8, and kernel version is '3.13.0-86-generic'
1. Create some rados objects ...
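A minimal sketch of the kind of object described above (zero data size, omap-only), assuming a scratch pool named testpool:
  rados -p testpool create obj1                     # 0-byte object
  for i in $(seq 1 100000); do
    rados -p testpool setomapval obj1 key$i val$i   # pile up omap entries
  done
  rados -p testpool stat obj1                       # size is still 0
  rados -p testpool listomapkeys obj1 | wc -l       # large omap key count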
01/02/2022
- 06:58 PM Bug #53751: "N monitors have not enabled msgr2" is always shown for new clusters
- Another thing I don't understand from the docs:
https://docs.ceph.com/en/pacific/rados/configuration/msgr2/#transi...
- 06:34 PM Bug #53751 (Need More Info): "N monitors have not enabled msgr2" is always shown for new clusters
- I am experiencing that for new clusters (currently Ceph 16.2.7), `ceph status` always shows e.g.:
3 monitors h...
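For completeness, the documented way to clear this warning once all monitors can bind the v2 port:
  ceph mon enable-msgr2   # have each mon also bind the msgr2 port (3300)
  ceph mon dump           # every mon should now list both a v2: and a v1: address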
12/31/2021
- 01:11 PM Bug #48468: ceph-osd crash before being up again
- Hey Gonzalo,
It was some time ago, but from memory I created a huge swapfile (~50G) and restarted the os...
- 04:15 AM Bug #53749 (New): ceph device scrape-health-metrics truncates smartctl/nvme output to 100 KiB
- * https://github.com/ceph/ceph/blob/ae17c0a0c319c42d822e4618fd0d1c52c9b07ed1/src/common/blkdev.cc#L729
* https://git...
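A rough way to observe the truncation from the linked code paths (assuming /dev/sda backs an OSD and device monitoring is enabled; <devid> is a placeholder):
  smartctl -x --json /dev/sda | wc -c       # size of the raw smartctl JSON
  ceph device ls                            # look up the device id
  ceph device get-health-metrics <devid>    # stored copy; compare its size with the raw output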
12/30/2021
- 11:42 PM Bug #51626: OSD uses all host memory (80g) on startup due to pg_split
- Gonzalo Aguilar Delgado wrote:
> Any update? I have the same trouble...
>
> I downgraded the kernel to 4.XX because w...
- 01:17 PM Bug #51626: OSD uses all host memory (80g) on startup due to pg_split
- Any update? I have the same trouble...
I downgraded the kernel to 4.XX because with a newer kernel I cannot even get thi...
12/29/2021
- 08:00 AM Bug #53740 (Resolved): mon: all mon daemon always crash after rm pool
- We have an OpenStack cluster. Last week we started clearing all OpenStack instances and deleting all Ceph pools. All mo...
12/28/2021
- 04:25 AM Bug #50775: mds and osd unable to obtain rotating service keys
- We ran into this issue, too. Our environment is a multi-host cluster (v15.2.9). Sometimes, we can observe that "unabl...
12/27/2021
- 12:34 AM Bug #53729: ceph-osd takes all memory before oom on boot
- Could you please set debug-osd to 5/20 and share the relevant OSD startup log?
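Since the daemon dies during startup, the runtime `ceph tell` route won't help; the level has to be set persistently before the OSD is started. A sketch (osd.33 is a placeholder id):
  ceph config set osd debug_osd 5/20            # cluster-wide, survives restarts
  # or per daemon in ceph.conf before starting it:
  #   [osd]
  #   debug osd = 5/20
  ceph tell osd.33 config set debug_osd 5/20    # runtime alternative once an OSD is up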
12/25/2021
- 09:30 PM Bug #53729 (Resolved): ceph-osd takes all memory before oom on boot
- Hi, I cannot boot half of my OSDs; all of them die, OOM-killed.
It seems they are taking all the memory. Everythi...
- 09:03 PM Bug #48468: ceph-osd crash before being up again
- Clément Hampaï wrote:
> Hi Sage,
>
> Hum, I've finally managed to recover my cluster after an uncounted osd resta...
- 09:03 PM Bug #48468: ceph-osd crash before being up again
- Hi, I'm having the same problem.
-7> 2021-12-25T12:05:37.491+0100 7fd15c920640 1 heartbeat_map reset_timeout '...
12/23/2021
- 07:28 PM Bug #52925 (Closed): pg peering always after triggering async recovery
- As per https://github.com/ceph/ceph/pull/43534#issuecomment-984252587
- 07:27 PM Backport #53721 (Resolved): octopus: common: admin socket compiler warning
- 07:27 PM Backport #53720 (Resolved): pacific: common: admin socket compiler warning
- 07:25 PM Backport #53719 (Resolved): octopus: mon: frequent cpu_tp had timed out messages
- https://github.com/ceph/ceph/pull/44546
- 07:25 PM Backport #53718 (Resolved): pacific: mon: frequent cpu_tp had timed out messages
- https://github.com/ceph/ceph/pull/44545
- 07:24 PM Bug #43266 (Pending Backport): common: admin socket compiler warning
- 07:21 PM Bug #53506 (Pending Backport): mon: frequent cpu_tp had timed out messages
- 07:17 PM Bug #52124: Invalid read of size 8 in handle_recovery_delete()
- /a/yuriw-2021-12-22_22:11:35-rados-wip-yuri3-testing-2021-12-22-1047-distro-default-smithi/6580187
- 07:14 PM Bug #52124: Invalid read of size 8 in handle_recovery_delete()
- /a/yuriw-2021-12-22_22:11:35-rados-wip-yuri3-testing-2021-12-22-1047-distro-default-smithi/6580436
- 02:07 PM Bug #52509: PG merge: PG stuck in premerge+peered state
- Neha Ojha wrote:
> Konstantin Shalygin wrote:
> > We can plan and spend time to set up a staging cluster for this and ...
- 10:17 AM Bug #52509: PG merge: PG stuck in premerge+peered state
- @Markus just for the record, what is your Ceph version? And what is your hardware for OSDs? The actual issue was on ...
- 10:45 AM Bug #53327: osd: osd_fast_shutdown_notify_mon not quite right and enable osd_fast_shutdown_notify...
- Hi Sage,
is there any update?
- 09:32 AM Backport #53701 (In Progress): octopus: qa/tasks/backfill_toofull.py: AssertionError: 2.0 not in ...
- PR: https://github.com/ceph/ceph/pull/43438
- 09:25 AM Backport #53702 (In Progress): pacific: qa/tasks/backfill_toofull.py: AssertionError: 2.0 not in ...
- 03:05 AM Bug #49689: osd/PeeringState.cc: ceph_abort_msg("past_interval start interval mismatch") start
- Anonymous user wrote:
> Neha Ojha wrote:
> > [...]
> >
> > looks like this pg had the same pi when it got created
> >
>...
12/22/2021
- 05:25 PM Backport #53702 (Resolved): pacific: qa/tasks/backfill_toofull.py: AssertionError: 2.0 not in bac...
- https://github.com/ceph/ceph/pull/44387
- 05:25 PM Backport #53701 (Resolved): octopus: qa/tasks/backfill_toofull.py: AssertionError: 2.0 not in bac...
- https://github.com/ceph/ceph/pull/43438
- 05:23 PM Bug #53677 (Pending Backport): qa/tasks/backfill_toofull.py: AssertionError: 2.0 not in backfilling
- 04:06 PM Bug #53677 (Fix Under Review): qa/tasks/backfill_toofull.py: AssertionError: 2.0 not in backfilling
- 03:44 PM Bug #47589: radosbench times out "reached maximum tries (800) after waiting for 4800 seconds"
- /a/yuriw-2021-12-21_18:01:07-rados-wip-yuri3-testing-2021-12-21-0749-distro-default-smithi/6576331/
- 11:53 AM Bug #53663: Random scrub errors (omap_digest_mismatch) on pgs of RADOSGW metadata pools
- I asked on the ML about this issue - see thread here: https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/messag...
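The usual first steps for scrub errors like these (pgid 7.1c is a placeholder):
  rados list-inconsistent-obj 7.1c --format=json-pretty   # shows the omap_digest mismatch per shard
  ceph pg repair 7.1c                                     # ask the primary to repair the PG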
- 03:40 AM Bug #49689: osd/PeeringState.cc: ceph_abort_msg("past_interval start interval mismatch") start
- Neha Ojha wrote:
> [...]
>
> looks like this pg had the same pi when it got created
>
> [...]
>
> but get_r...
12/21/2021
- 05:37 PM Bug #53677 (In Progress): qa/tasks/backfill_toofull.py: AssertionError: 2.0 not in backfilling
- This looks like a false test failure due to the test setting the backfillfull ratio too low.
We see that the test calc...
- 11:22 AM Bug #53685 (New): Assertion `HAVE_FEATURE(features, SERVER_OCTOPUS)' failed.
- Test "rados/verify/{centos_latest ceph clusters/{fixed-2 openstack} d-thrash/default/{default thrashosds-health} mon_...
12/20/2021
- 05:30 PM Bug #53677 (Resolved): qa/tasks/backfill_toofull.py: AssertionError: 2.0 not in backfilling
- ...
- 11:51 AM Bug #23827: osd sends op_reply out of order
- This bug occurred in my online environment (Nautilus 14.2.5) a few days ago, and my application exited because the client's ...
- 09:10 AM Bug #53667: osd cannot be started after being set to stop
- fix in https://github.com/ceph/ceph/pull/44363
- 08:55 AM Bug #53667 (Fix Under Review): osd cannot be started after being set to stop
- After setting `osd stop`, the OSD cannot be brought up again:
[root@controller-2 ~]# ceph osd status
ID HOST US...
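A repro sketch of the reported sequence, using osd.2 as in the pasted output (service name assumes a plain systemd deployment):
  ceph osd stop 2                # mark osd.2 stopped
  systemctl restart ceph-osd@2   # try to bring it back up
  ceph osd status                # before the fix in PR 44363, the OSD stays down here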