From 12/28/2021 to 01/26/2022

01/26/2022

11:54 PM Backport #53534: octopus: mon: mgrstatmonitor spams mgr with service_map
Backport Bot wrote:
> https://github.com/ceph/ceph/pull/44722
merged
Yuri Weinstein
11:39 PM Backport #53769: pacific: [ceph osd set noautoscale] Global on/off flag for PG autoscale feature
Kamoltat Sirivadhna wrote:
> https://github.com/ceph/ceph/pull/44540
merged
Yuri Weinstein
08:53 PM Bug #53729: ceph-osd takes all memory before oom on boot
In the meantime, Neha mentioned that you might be able to prevent the PGs from splitting by turning off the autoscal... Mark Nelson
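The workaround Mark hints at above (stopping PG splits by turning off the autoscaler) might look like the following sketch; the pool name `mypool` is a placeholder, not from this tracker entry, and the commands assume admin access to a running cluster.

```shell
# Sketch only: disable the PG autoscaler on one pool so its PG count
# stays fixed (pool name "mypool" is a placeholder).
ceph osd pool set mypool pg_autoscale_mode off

# Confirm the autoscaler mode change.
ceph osd pool autoscale-status
```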
08:34 PM Bug #53729: ceph-osd takes all memory before oom on boot
Hi Gonzalo,
I'm not an expert regarding this code so please take my reply here with a grain of salt (and others pl...
Mark Nelson
05:26 PM Bug #53729: ceph-osd takes all memory before oom on boot
How can I help to accelerate a bugfix or workaround?
If you comment your investigations, I can build a docker image t...
Gonzalo Aguilar Delgado
04:16 PM Bug #53326: pgs wait for read lease after osd start
https://github.com/ceph/ceph/pull/44585 merged Yuri Weinstein
12:27 AM Bug #53326: pgs wait for read lease after osd start
https://github.com/ceph/ceph/pull/44584 merged Yuri Weinstein
04:14 PM Backport #53701: octopus: qa/tasks/backfill_toofull.py: AssertionError: 2.0 not in backfilling
Mykola Golub wrote:
> PR: https://github.com/ceph/ceph/pull/43438
merged
Yuri Weinstein
04:14 PM Backport #52833: octopus: osd: pg may get stuck in backfill_toofull after backfill is interrupted...
Backport Bot wrote:
> https://github.com/ceph/ceph/pull/43438
merged
Yuri Weinstein
12:06 PM Bug #53142: OSD crash in PG::do_delete_work when increasing PGs
>Igor Fedotov wrote:
> I doubt anyone can say what setup would be good for you without experiments in the field. M...
Ist Gab
12:04 PM Bug #44184: Slow / Hanging Ops after pool creation
Neha Ojha wrote:
> Are you still seeing this problem? Will you be able to provide debug data around this issue?
H...
Ist Gab
12:47 AM Bug #45318 (New): Health check failed: 2/6 mons down, quorum b,a,c,e (MON_DOWN)" in cluster log r...
Octopus still has this issue /a/yuriw-2022-01-24_18:01:47-rados-wip-yuri10-testing-2022-01-24-0810-octopus-distro-def... Neha Ojha

01/25/2022

05:40 PM Bug #50608 (Need More Info): ceph_assert(is_primary()) in PrimaryLogPG::on_local_recover
Neha Ojha
05:36 PM Bug #52503: cli_generic.sh: slow ops when trying rand write on cache pools
Here is a representative run (wip-dis-testing is essentially master):
https://pulpito.ceph.com/dis-2022-01-25_16:1...
Ilya Dryomov
12:50 AM Bug #52503: cli_generic.sh: slow ops when trying rand write on cache pools
Ilya Dryomov wrote:
> This has been bugging the rbd suite for a while. I don't think messenger failure injection is...
Neha Ojha
01:34 PM Bug #53327 (In Progress): osd: osd_fast_shutdown_notify_mon not quite right and enable osd_fast_s...
Nitzan Mordechai
10:39 AM Backport #53944 (In Progress): pacific: [RFE] Limit slow request details to mgr log
Prashant D
09:01 AM Backport #53978 (In Progress): quincy: [RFE] Limit slow request details to mgr log
Prashant D
08:51 AM Bug #54005 (Duplicate): Why can wrong parameters be specified when creating erasure-code-profile,...
My osd tree is like below:
ceph osd tree
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-7 0....
wang kevin
08:45 AM Bug #54004 (Rejected): When creating erasure-code-profile incorrectly set parameters, it can be c...
My osd tree is like below:
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-7 0.19498 root mytest...
wang kevin

01/24/2022

11:19 PM Bug #52503: cli_generic.sh: slow ops when trying rand write on cache pools
This has been bugging the rbd suite for a while. I don't think messenger failure injection is the problem because th... Ilya Dryomov
11:12 PM Bug #53327 (New): osd: osd_fast_shutdown_notify_mon not quite right and enable osd_fast_shutdown_...
Neha Ojha
10:59 PM Bug #53940 (Rejected): EC pool creation is setting min_size to K+1 instead of K
As discussed offline, we should revisit our recovery test coverage for various EC profiles, but closing this issue. Neha Ojha
10:56 PM Bug #52621 (Can't reproduce): cephx: verify_authorizer could not decrypt ticket info: error: bad ...
Neha Ojha
10:44 PM Bug #44184: Slow / Hanging Ops after pool creation
Ist Gab wrote:
> Neha Ojha wrote:
>
> > Which version are you using?
>
> Octopus 15.2.14
Are you still seei...
Neha Ojha
10:39 PM Bug #52535 (Need More Info): monitor crashes after an OSD got destroyed: OSDMap.cc: 5686: FAILED ...
Neha Ojha
10:37 PM Bug #48997 (Can't reproduce): rados/singleton/all/recovery-preemption: defer backfill|defer recov...
Neha Ojha
10:36 PM Bug #50106 (Can't reproduce): scrub/osd-scrub-repair.sh: corrupt_scrub_erasure: return 1
Neha Ojha
10:36 PM Bug #50245 (Can't reproduce): TEST_recovery_scrub_2: Not enough recovery started simultaneously
Neha Ojha
10:35 PM Bug #49961 (Can't reproduce): scrub/osd-recovery-scrub.sh: TEST_recovery_scrub_1 failed
Neha Ojha
10:35 PM Bug #46847 (Need More Info): Loss of placement information on OSD reboot
Is this issue reproducible in Octopus or later? Neha Ojha
10:32 PM Bug #50462 (Won't Fix - EOL): OSDs crash in osd/osd_types.cc: FAILED ceph_assert(clone_overlap.co...
Please feel free to reopen if you see the issue in a recent version of Ceph.
Neha Ojha
10:31 PM Bug #49688 (Can't reproduce): FAILED ceph_assert(is_primary()) in submit_log_entries during Promo...
Neha Ojha
10:30 PM Bug #48028 (Won't Fix - EOL): ceph-mon always suffer lots of slow ops from v14.2.9
Please feel free to reopen if you see the issue in a recent version of Ceph. Neha Ojha
10:29 PM Bug #50512 (Won't Fix - EOL): upgrade:nautilus-p2p-nautilus: unhandled event in ToDelete
Neha Ojha
10:29 PM Bug #50473 (Can't reproduce): ceph_test_rados_api_lock_pp segfault in librados::v14_2_0::RadosCli...
Neha Ojha
10:28 PM Bug #50242 (Can't reproduce): test_repair_corrupted_obj fails with assert not inconsistent
Neha Ojha
10:28 PM Bug #50119 (Can't reproduce): Invalid read of size 4 in ceph::logging::Log::dump_recent()
Neha Ojha
10:26 PM Bug #47153 (Won't Fix - EOL): monitor crash during upgrade due to LogSummary encoding changes bet...
Neha Ojha
10:26 PM Bug #49523: rebuild-mondb doesn't populate mgr commands -> pg dump EINVAL
Haven't seen this in recent runs. Neha Ojha
10:24 PM Bug #49463 (Can't reproduce): qa/standalone/misc/rados-striper.sh: Caught signal in thread_name:r...
Neha Ojha
10:14 PM Bug #53910 (Closed): client: client session state stuck in opening and hang all the time
Neha Ojha
10:43 AM Bug #47273: ceph report missing osdmap_clean_epochs if answered by peon
> Is it possible that this is related?
I'm not sure, but I guess not.
I think this bug is rather about not forwa...
Dan van der Ster
06:05 AM Bug #52486 (Pending Backport): test tracker: please ignore
Deepika Upadhyay

01/22/2022

12:06 AM Backport #53978 (Resolved): quincy: [RFE] Limit slow request details to mgr log
https://github.com/ceph/ceph/pull/44764 Backport Bot
12:05 AM Backport #53977 (Rejected): quincy: mon: all mon daemon always crash after rm pool
Backport Bot
12:05 AM Backport #53974 (Resolved): quincy: BufferList.rebuild_aligned_size_and_memory failure
Backport Bot

01/21/2022

07:30 PM Backport #53972 (Resolved): pacific: BufferList.rebuild_aligned_size_and_memory failure
Backport Bot
07:25 PM Backport #53971 (Resolved): octopus: BufferList.rebuild_aligned_size_and_memory failure
Backport Bot
07:22 PM Bug #53969 (Pending Backport): BufferList.rebuild_aligned_size_and_memory failure
Neha Ojha
07:15 PM Bug #53969 (Fix Under Review): BufferList.rebuild_aligned_size_and_memory failure
Neha Ojha
07:14 PM Bug #53969 (Resolved): BufferList.rebuild_aligned_size_and_memory failure
... Neha Ojha
06:59 PM Bug #45345 (Can't reproduce): tasks/rados.py fails with "psutil.NoSuchProcess: psutil.NoSuchProce...
Neha Ojha
06:58 PM Bug #45318 (Can't reproduce): Health check failed: 2/6 mons down, quorum b,a,c,e (MON_DOWN)" in c...
Neha Ojha
06:56 PM Bug #38375: OSD segmentation fault on rbd create
I do not have the files to re-upload, so it might be worth closing this out as I have moved on to another release and this... Ryan Farrington
06:53 PM Bug #43553 (Can't reproduce): mon: client mon_status fails
Neha Ojha
06:49 PM Bug #43048 (Won't Fix - EOL): nautilus: upgrade/mimic-x/stress-split: failed to recover before ti...
Neha Ojha
06:48 PM Bug #42102 (Can't reproduce): use-after-free in Objecter timer handing
Neha Ojha
06:43 PM Bug #40521 (Can't reproduce): cli timeout (e.g., ceph pg dump)
Neha Ojha
06:38 PM Bug #23911 (Won't Fix - EOL): ceph:luminous: osd out/down when setup with ubuntu/bluestore
Neha Ojha
06:37 PM Bug #20952 (Can't reproduce): Glitchy monitor quorum causes spurious test failure
Neha Ojha
06:36 PM Bug #14115 (Can't reproduce): crypto: race in nss init
Neha Ojha
06:36 PM Bug #13385 (Can't reproduce): cephx: verify_authorizer could not decrypt ticket info: error: NSS ...
Neha Ojha
06:35 PM Bug #11235 (Can't reproduce): test_rados.py test_aio_read is racy
Neha Ojha
05:24 PM Backport #53534 (In Progress): octopus: mon: mgrstatmonitor spams mgr with service_map
Cory Snyder
05:22 PM Backport #53535 (In Progress): pacific: mon: mgrstatmonitor spams mgr with service_map
Cory Snyder
03:55 PM Bug #47273: ceph report missing osdmap_clean_epochs if answered by peon
I am also seeing this behavior on the latest Octopus and Pacific releases.
The reason I'm looking is that I'm seei...
Steve Taylor

01/20/2022

10:24 PM Bug #53940: EC pool creation is setting min_size to K+1 instead of K
Laura Flores wrote:
> Thanks for this info, Dan. We have held off on making a change to min_size, and we're currentl...
Vikhyat Umrao
08:16 PM Bug #53940: EC pool creation is setting min_size to K+1 instead of K
Thanks for this info, Dan. We have held off on making a change to min_size, and we're currently discussing ways to en... Laura Flores
07:27 PM Backport #53943 (In Progress): octopus: mon: all mon daemon always crash after rm pool
Cory Snyder
06:44 PM Backport #53942 (In Progress): pacific: mon: all mon daemon always crash after rm pool
Cory Snyder
06:29 AM Bug #53910: client: client session state stuck in opening and hang all the time
Sorry, close this issue please. Ivan Guan
02:00 AM Backport #53944 (Resolved): pacific: [RFE] Limit slow request details to mgr log
https://github.com/ceph/ceph/pull/44771 Backport Bot
01:21 AM Feature #52424 (Pending Backport): [RFE] Limit slow request details to mgr log
Vikhyat Umrao
01:13 AM Bug #53924: EC PG stuck recovery_unfound+undersized+degraded+remapped+peered
We have marked the primary OSD.33 down [1] and it has helped the stuck recovery_unfound pg to get unstuck and recover... Vikhyat Umrao

01/19/2022

11:18 PM Bug #53855: rados/test.sh hangs while running LibRadosTwoPoolsPP.ManifestFlushDupCount
Myoungwon Oh: any ideas on this bug?
Neha Ojha
11:15 PM Bug #53875 (Duplicate): AssertionError: wait_for_recovery: failed before timeout expired due to d...
Neha Ojha
11:15 PM Backport #53943 (Resolved): octopus: mon: all mon daemon always crash after rm pool
https://github.com/ceph/ceph/pull/44700 Backport Bot
11:10 PM Backport #53942 (Resolved): pacific: mon: all mon daemon always crash after rm pool
https://github.com/ceph/ceph/pull/44698 Backport Bot
11:09 PM Bug #53910 (Need More Info): client: client session state stuck in opening and hang all the time
Can you provide more details about this bug? Neha Ojha
11:05 PM Bug #53740 (Pending Backport): mon: all mon daemon always crash after rm pool
Neha Ojha
09:00 PM Bug #53924: EC PG stuck recovery_unfound+undersized+degraded+remapped+peered
Looks like the last time the PG was active was at "2022-01-18T17:38:23.338"... Neha Ojha
07:26 PM Bug #53940: EC pool creation is setting min_size to K+1 instead of K
For history, here's where the default was set to k+1.
https://github.com/ceph/ceph/pull/8008/commits/48e40fcde7b19...
Dan van der Ster
06:53 PM Bug #53940 (Rejected): EC pool creation is setting min_size to K+1 instead of K
For more information please check the RHCS bug - https://bugzilla.redhat.com/show_bug.cgi?id=2039585. Vikhyat Umrao
03:33 PM Bug #53923 (In Progress): [Upgrade] mgr FAILED to decode MSG_PGSTATS
Neha Ojha
02:07 PM Bug #44092 (Fix Under Review): mon: config commands do not accept whitespace style config name
Patrick Donnelly
01:55 PM Backport #53933 (In Progress): pacific: Stretch mode: peering can livelock with acting set change...
Greg Farnum
01:50 PM Backport #53933 (Resolved): pacific: Stretch mode: peering can livelock with acting set changes s...
https://github.com/ceph/ceph/pull/44664 Backport Bot
01:46 PM Bug #53824 (Pending Backport): Stretch mode: peering can livelock with acting set changes swappin...
Greg Farnum

01/18/2022

09:20 PM Bug #53924: EC PG stuck recovery_unfound+undersized+degraded+remapped+peered
Ceph OSD 33 Logs with grep unfound! Vikhyat Umrao
09:14 PM Bug #53924: EC PG stuck recovery_unfound+undersized+degraded+remapped+peered
Ceph PG query! Vikhyat Umrao
09:11 PM Bug #53924 (Need More Info): EC PG stuck recovery_unfound+undersized+degraded+remapped+peered
... Vikhyat Umrao
08:36 PM Bug #53923 (Resolved): [Upgrade] mgr FAILED to decode MSG_PGSTATS
... Vikhyat Umrao
05:42 PM Bug #51076: "wait_for_recovery: failed before timeout expired" during thrashosd test with EC back...
/a/yuriw-2022-01-15_05:47:18-rados-wip-yuri8-testing-2022-01-14-1551-distro-default-smithi/6619577
/a/yuriw-2022-01-...
Laura Flores
04:23 PM Bug #45721: CommandFailedError: Command failed (workunit test rados/test_python.sh) FAIL: test_ra...
/a/yuriw-2022-01-14_23:22:09-rados-wip-yuri6-testing-2022-01-14-1207-distro-default-smithi/6617813 Laura Flores
08:26 AM Bug #53910 (Closed): client: client session state stuck in opening and hang all the time
Ivan Guan

01/16/2022

08:40 PM Bug #53729: ceph-osd takes all memory before oom on boot
Do you need something else to find a workaround or the full solution?
Is there anything I can do?
Gonzalo Aguilar Delgado

01/14/2022

11:21 PM Bug #53895 (Resolved): Unable to format `ceph config dump` command output in yaml using `-f yaml`
https://bugzilla.redhat.com/show_bug.cgi?id=2040709 Vikhyat Umrao
10:45 AM Bug #43266 (Resolved): common: admin socket compiler warning
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Loïc Dachary
07:56 AM Bug #43887: ceph_test_rados_delete_pools_parallel failure
/a/yuriw-2022-01-13_18:06:52-rados-wip-yuri3-testing-2022-01-13-0809-distro-default-smithi/6614510 Aishwarya Mathuria
12:26 AM Backport #53877 (In Progress): octopus: pgs wait for read lease after osd start
Vikhyat Umrao
12:12 AM Backport #53876 (In Progress): pacific: pgs wait for read lease after osd start
Vikhyat Umrao

01/13/2022

11:15 PM Backport #53877 (Resolved): octopus: pgs wait for read lease after osd start
https://github.com/ceph/ceph/pull/44585 Backport Bot
11:15 PM Backport #53876 (Resolved): pacific: pgs wait for read lease after osd start
https://github.com/ceph/ceph/pull/44584 Backport Bot
11:11 PM Bug #53326 (Pending Backport): pgs wait for read lease after osd start
Neha Ojha
10:54 PM Bug #53729: ceph-osd takes all memory before oom on boot
Neha Ojha wrote:
> Gonzalo Aguilar Delgado wrote:
> > Neha Ojha wrote:
> > > Like the other case reported in the m...
Gonzalo Aguilar Delgado
10:52 PM Bug #53729: ceph-osd takes all memory before oom on boot
Igor Fedotov wrote:
> One more case:
> https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/FQXV452YLHBJ...
Gonzalo Aguilar Delgado
12:23 PM Bug #53729: ceph-osd takes all memory before oom on boot
One more case:
https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/FQXV452YLHBJW6Y2UK7WUZP7HO5PVIA5/
Igor Fedotov
10:13 PM Bug #53767: qa/workunits/cls/test_cls_2pc_queue.sh: killing an osd during thrashing causes timeout
Same failed test, and same Traceback message as reported above. Pasted here is another relevant part of the log that ... Laura Flores
09:06 PM Bug #51076: "wait_for_recovery: failed before timeout expired" during thrashosd test with EC back...
/a/yuriw-2022-01-11_19:17:55-rados-wip-yuri5-testing-2022-01-11-0843-distro-default-smithi/6608450 Laura Flores
09:01 PM Bug #53875 (Duplicate): AssertionError: wait_for_recovery: failed before timeout expired due to d...
Description: rados/thrash-erasure-code-big/{ceph cluster/{12-osds openstack} mon_election/connectivity msgr-failures/... Laura Flores
08:57 PM Bug #51904: test_pool_min_size:AssertionError:wait_for_clean:failed before timeout expired due to...
/a/yuriw-2022-01-12_21:37:22-rados-wip-yuri6-testing-2022-01-12-1131-distro-default-smithi/6611439
last pg map bef...
Laura Flores

01/12/2022

11:04 PM Bug #53729: ceph-osd takes all memory before oom on boot
Gonzalo Aguilar Delgado wrote:
> Neha Ojha wrote:
> > Like the other case reported in the mailing list ([ceph-users...
Neha Ojha
09:50 PM Bug #53729: ceph-osd takes all memory before oom on boot
Gonzalo Aguilar Delgado wrote:
> Hi,
>
> The logs I've already provided had:
> --debug_osd 90 --debug_mon 2 --d...
Neha Ojha
08:40 PM Bug #53729: ceph-osd takes all memory before oom on boot
Neha Ojha wrote:
> Like the other case reported in the mailing list ([ceph-users] OSDs use 200GB RAM and crash) and ...
Gonzalo Aguilar Delgado
08:38 PM Bug #53729: ceph-osd takes all memory before oom on boot
Neha Ojha wrote:
> Like the other case reported in the mailing list ([ceph-users] OSDs use 200GB RAM and crash) and ...
Gonzalo Aguilar Delgado
08:37 PM Bug #53729: ceph-osd takes all memory before oom on boot
Hi,
The logs I've already provided had:
--debug_osd 90 --debug_mon 2 --debug_filestore 7 --debug_monc 99 --debug...
Gonzalo Aguilar Delgado
06:24 PM Bug #53729: ceph-osd takes all memory before oom on boot
Like the other case reported in the mailing list ([ceph-users] OSDs use 200GB RAM and crash) and https://tracker.ceph... Neha Ojha
07:12 PM Bug #53855 (Resolved): rados/test.sh hangs while running LibRadosTwoPoolsPP.ManifestFlushDupCount
Description: rados/basic/{ceph clusters/{fixed-2 openstack} mon_election/connectivity msgr-failures/many msgr/async-v... Laura Flores
07:08 PM Bug #53294: rados/test.sh hangs while running LibRadosTwoPoolsPP.TierFlushDuringFlush
Later on in the example Neha originally posted (/a/yuriw-2021-11-15_19:24:05-rados-wip-yuri8-testing-2021-11-15-0845-... Laura Flores
06:55 PM Support #51609: OSD refuses to start (OOMK) due to pg split
Tor Martin Ølberg wrote:
> Tor Martin Ølberg wrote:
> > After an upgrade to 15.2.13 from 15.2.4 my small home lab c...
Neha Ojha
06:19 PM Bug #52124: Invalid read of size 8 in handle_recovery_delete()
/a/yuriw-2022-01-11_19:17:55-rados-wip-yuri5-testing-2022-01-11-0843-distro-default-smithi/6608445/ Laura Flores
10:03 AM Bug #50659: Segmentation fault under Pacific 16.2.1 when using a custom crush location hook
This is present in 16.2.7. Any reason why the linked PR wasn't merged into that release? Janek Bevendorff

01/11/2022

08:47 PM Backport #53719 (In Progress): octopus: mon: frequent cpu_tp had timed out messages
Cory Snyder
08:33 PM Backport #53718 (In Progress): pacific: mon: frequent cpu_tp had timed out messages
Cory Snyder
08:31 PM Backport #53507 (Duplicate): pacific: ceph -s mon quorum age negative number
Backport was handled in PR: https://github.com/ceph/ceph/pull/43698 Cory Snyder
08:29 PM Backport #53660 (In Progress): octopus: mon: "FAILED ceph_assert(session_map.sessions.empty())" w...
Cory Snyder
08:29 PM Backport #53659 (In Progress): pacific: mon: "FAILED ceph_assert(session_map.sessions.empty())" w...
Cory Snyder
08:27 PM Backport #53721 (Resolved): octopus: common: admin socket compiler warning
The relevant code has already made it into Octopus, no further backport required. Cory Snyder
08:27 PM Backport #53720 (Resolved): pacific: common: admin socket compiler warning
The relevant code has already made it to Pacific, no further backport necessary. Cory Snyder
08:14 PM Backport #53769 (In Progress): pacific: [ceph osd set noautoscale] Global on/off flag for PG auto...
Kamoltat (Junior) Sirivadhna
08:14 PM Backport #53769: pacific: [ceph osd set noautoscale] Global on/off flag for PG autoscale feature
https://github.com/ceph/ceph/pull/44540 Kamoltat (Junior) Sirivadhna
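The feature being backported here is the global on/off flag named in the issue title; it is used roughly as follows, assuming a cluster on a release where the flag is available.

```shell
# Sketch only: pause PG autoscaling cluster-wide, then resume it.
ceph osd set noautoscale
ceph osd unset noautoscale
```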
01:55 PM Bug #53824 (Fix Under Review): Stretch mode: peering can livelock with acting set changes swappin...
Greg Farnum
12:14 AM Bug #53824: Stretch mode: peering can livelock with acting set changes swapping primary back and ...
So, why is it accepting the non-acting-set member each time, when they seem to have the same data? There's a clue in ... Greg Farnum
12:14 AM Bug #53824 (Pending Backport): Stretch mode: peering can livelock with acting set changes swappin...
From https://bugzilla.redhat.com/show_bug.cgi?id=2025800
We're getting repeated swaps in the acting set, with logg...
Greg Farnum
06:42 AM Bug #52319: LibRadosWatchNotify.WatchNotify2 fails
/a/yuriw-2022-01-06_15:57:04-rados-wip-yuri6-testing-2022-01-05-1255-distro-default-smithi/6599471... Sridhar Seshasayee
05:29 AM Bug #45721: CommandFailedError: Command failed (workunit test rados/test_python.sh) FAIL: test_ra...
/a/yuriw-2022-01-06_15:57:04-rados-wip-yuri6-testing-2022-01-05-1255-distro-default-smithi/6599449... Sridhar Seshasayee

01/10/2022

10:20 PM Bug #53729: ceph-osd takes all memory before oom on boot
Forget about the previous comment.
The stack trace is just the opposite; it seems that the call to encode in PGog::_writ...
Gonzalo Aguilar Delgado
10:02 PM Bug #53729: ceph-osd takes all memory before oom on boot
I was taking a look at:
3,1 GiB: OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*) (in /usr/bin/ce...
Gonzalo Aguilar Delgado
09:37 PM Bug #53729: ceph-osd takes all memory before oom on boot
I did something better. I added a new OSD with bluestore to see if it's a problem with the filestore backend.
Then ...
Gonzalo Aguilar Delgado
09:43 AM Bug #52124: Invalid read of size 8 in handle_recovery_delete()
/a/yuriw-2022-01-08_17:57:43-rados-wip-yuri8-testing-2022-01-07-1541-distro-default-smithi/6603232 Sridhar Seshasayee
02:22 AM Bug #53740: mon: all mon daemon always crash after rm pool
Neha Ojha wrote:
> Do you happen to have a coredump from this crash?
No
Taizeng Wu

01/07/2022

10:25 PM Bug #53789: CommandFailedError (rados/test_python.sh): "RADOS object not found" causes test_rados...
/a/yuriw-2022-01-06_15:50:38-rados-wip-yuri8-testing-2022-01-05-1411-distro-default-smithi/6598917 Laura Flores
10:09 PM Bug #48468: ceph-osd crash before being up again
Igor Fedotov wrote:
> @neha, @Gonsalo - to avoid the mess let's use https://tracker.ceph.com/issues/53729 for furthe...
Neha Ojha
09:56 PM Bug #48468: ceph-osd crash before being up again
@neha, @Gonsalo - to avoid the mess let's use https://tracker.ceph.com/issues/53729 for further communication on the ... Igor Fedotov
06:22 PM Bug #48468: ceph-osd crash before being up again
Gonzalo Aguilar Delgado wrote:
> Hi I'm having the same problem.
>
> -7> 2021-12-25T12:05:37.491+0100 7fd15c9...
Neha Ojha
09:48 PM Bug #53729: ceph-osd takes all memory before oom on boot
Looks relevant as well:
https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/YHR3P7N5EXCKNHK45L7FRF4XNBOC...
Igor Fedotov
09:20 PM Bug #50192: FAILED ceph_assert(attrs || !recovery_state.get_pg_log().get_missing().is_missing(soi...
/a/yuriw-2022-01-06_15:50:38-rados-wip-yuri8-testing-2022-01-05-1411-distro-default-smithi/6599338 Laura Flores
06:41 PM Bug #53806 (Resolved): unnecessarily long laggy PG state
the first `pg_lease_ack_t` after becoming laggy would not trigger `recheck_readable`. However, every other ack would ... 玮文 胡
06:34 PM Bug #53740: mon: all mon daemon always crash after rm pool
Do you happen to have a coredump from this crash? Neha Ojha

01/06/2022

09:57 PM Bug #53789 (Pending Backport): CommandFailedError (rados/test_python.sh): "RADOS object not found...
Description: rados/basic/{ceph clusters/{fixed-2 openstack} mon_election/connectivity msgr-failures/many msgr/async-v... Laura Flores

01/05/2022

09:56 PM Bug #53294: rados/test.sh hangs while running LibRadosTwoPoolsPP.TierFlushDuringFlush
/a/yuriw-2022-01-04_21:52:15-rados-wip-yuri7-testing-2022-01-04-1159-distro-default-smithi/6595525 Laura Flores
09:51 PM Bug #38357: ClsLock.TestExclusiveEphemeralStealEphemeral failed
/a/yuriw-2022-01-04_21:52:15-rados-wip-yuri7-testing-2022-01-04-1159-distro-default-smithi/6595522 Laura Flores

01/04/2022

11:31 PM Backport #53769 (Resolved): pacific: [ceph osd set noautoscale] Global on/off flag for PG autosca...
Backport Bot
11:26 PM Feature #51213 (Pending Backport): [ceph osd set noautoscale] Global on/off flag for PG autoscale...
Vikhyat Umrao
09:49 PM Bug #53768 (New): timed out waiting for admin_socket to appear after osd.2 restart in thrasher/de...
Error snippet:
2022-01-02T01:37:09.296 DEBUG:teuthology.orchestra.run.smithi086:> sudo adjust-ulimits ceph-coverag...
Joseph Sawaya
09:35 PM Bug #53767 (Duplicate): qa/workunits/cls/test_cls_2pc_queue.sh: killing an osd during thrashing c...
Description: rados/verify/{centos_latest ceph clusters/{fixed-2 openstack} d-thrash/default/{default thrashosds-healt... Laura Flores
06:14 PM Bug #51945: qa/workunits/mon/caps.sh: Error: Expected return 13, got 0
/a/yuriw-2021-12-23_16:50:03-rados-wip-yuri6-testing-2021-12-22-1410-distro-default-smithi/6582413... Laura Flores
12:37 PM Bug #23827: osd sends op_reply out of order
Here is the log information on my environment.
h3. op1 is coming and send shards to peers osd
2021-12-01 18:27:...
Ivan Guan
03:11 AM Bug #53757: I have a rados object whose data size is 0, and this object has a large amount of oma...
PR: https://github.com/ceph/ceph/pull/44450 xingyu wang
02:59 AM Bug #53757 (Fix Under Review): I have a rados object whose data size is 0, and this object has a ...
Env: ceph version is 10.2.9, OS is rhel7.8, and kernel version is '3.13.0-86-generic'
1. create some rados objects ...
xingyu wang

01/02/2022

06:58 PM Bug #53751: "N monitors have not enabled msgr2" is always shown for new clusters
Another thing I don't understand from the docs:
https://docs.ceph.com/en/pacific/rados/configuration/msgr2/#transi...
Niklas Hambuechen
06:34 PM Bug #53751 (Need More Info): "N monitors have not enabled msgr2" is always shown for new clusters
I am experiencing that for new clusters (currently Ceph 16.2.7), `ceph status` always shows e.g.:
3 monitors h...
Niklas Hambuechen
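On clusters where the warning is accurate, the usual remedy (per the msgr2 documentation the reporter quotes) is to enable msgr2 on the monitors; a sketch, assuming a cluster on a msgr2-capable release:

```shell
# Sketch only: enable the v2 protocol on all monitors, then check that
# each mon now advertises a v2 address.
ceph mon enable-msgr2
ceph mon dump | grep v2
```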

12/31/2021

01:11 PM Bug #48468: ceph-osd crash before being up again
Hey Gonzalo,
It was some time ago, but from memory I created a huge swapfile (~50G) and restarted the os...
Clément Hampaï
04:15 AM Bug #53749 (New): ceph device scrape-health-metrics truncates smartctl/nvme output to 100 KiB
* https://github.com/ceph/ceph/blob/ae17c0a0c319c42d822e4618fd0d1c52c9b07ed1/src/common/blkdev.cc#L729
* https://git...
Niklas Hambuechen

12/30/2021

11:42 PM Bug #51626: OSD uses all host memory (80g) on startup due to pg_split
Gonzalo Aguilar Delgado wrote:
> Any update? I have the same trouble...
>
> I downgraded kernel to 4.XX because w...
Tor Martin Ølberg
01:17 PM Bug #51626: OSD uses all host memory (80g) on startup due to pg_split
Any update? I have the same trouble...
I downgraded kernel to 4.XX because with newer kernel I cannot even get thi...
Gonzalo Aguilar Delgado

12/29/2021

08:00 AM Bug #53740 (Resolved): mon: all mon daemon always crash after rm pool
We have an OpenStack cluster. Last week we started clearing all OpenStack instances and deleting all Ceph pools. All mo... Taizeng Wu

12/28/2021

04:25 AM Bug #50775: mds and osd unable to obtain rotating service keys
We ran into this issue, too. Our environment is a multi-host cluster (v15.2.9). Sometimes, we can observe that "unabl... Jerry Pu
 
