Activity

From 12/06/2021 to 01/04/2022

01/04/2022

11:31 PM Backport #53769 (Resolved): pacific: [ceph osd set noautoscale] Global on/off flag for PG autosca...
Backport Bot
11:26 PM Feature #51213 (Pending Backport): [ceph osd set noautoscale] Global on/off flag for PG autoscale...
Vikhyat Umrao
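For context, the feature tracked above (#51213, backported in #53769) exposes the autoscaler on/off switch as the command named in its title. A minimal usage sketch; the unset form and the flag check are assumptions mirroring how other cluster-wide osdmap flags behave, not details taken from this tracker entry:
  ceph osd set noautoscale      # pause PG autoscaling cluster-wide
  ceph osd unset noautoscale    # assumed counterpart to resume autoscaling
  ceph osd dump | grep flags    # assumed check for whether the flag is currently set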
09:49 PM Bug #53768 (New): timed out waiting for admin_socket to appear after osd.2 restart in thrasher/de...
Error snippet:
2022-01-02T01:37:09.296 DEBUG:teuthology.orchestra.run.smithi086:> sudo adjust-ulimits ceph-coverag...
Joseph Sawaya
09:35 PM Bug #53767 (Duplicate): qa/workunits/cls/test_cls_2pc_queue.sh: killing an osd during thrashing c...
Description: rados/verify/{centos_latest ceph clusters/{fixed-2 openstack} d-thrash/default/{default thrashosds-healt... Laura Flores
06:14 PM Bug #51945: qa/workunits/mon/caps.sh: Error: Expected return 13, got 0
/a/yuriw-2021-12-23_16:50:03-rados-wip-yuri6-testing-2021-12-22-1410-distro-default-smithi/6582413... Laura Flores
12:37 PM Bug #23827: osd sends op_reply out of order
Here is the log information from my environment.
h3. op1 arrives and sends shards to peer OSDs
2021-12-01 18:27:...
Ivan Guan
03:11 AM Bug #53757: I have a rados object that data size is 0, and this object have a large amount of oma...
PR: https://github.com/ceph/ceph/pull/44450 xingyu wang
02:59 AM Bug #53757 (Fix Under Review): I have a rados object that data size is 0, and this object have a ...
Env: ceph version is 10.2.9, OS is RHEL 7.8, and kernel version is '3.13.0-86-generic'
1. Create some rados objects ...
xingyu wang

01/02/2022

06:58 PM Bug #53751: "N monitors have not enabled msgr2" is always shown for new clusters
Another thing I don't understand from the docs:
https://docs.ceph.com/en/pacific/rados/configuration/msgr2/#transi...
Niklas Hambuechen
06:34 PM Bug #53751 (Need More Info): "N monitors have not enabled msgr2" is always shown for new clusters
I am seeing that for new clusters (currently Ceph 16.2.7), `ceph status` always shows, e.g.:
3 monitors h...
Niklas Hambuechen
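For readers following #53751: the msgr2 transition described in the linked docs page is normally completed by enabling msgr2 on the monitors. A brief sketch, assuming all daemons already support msgr2; whether this clears the warning on a fresh cluster is exactly what the bug questions:
  ceph mon enable-msgr2   # ask each monitor to also bind the msgr2 port (3300)
  ceph mon dump           # each mon should then list a v2: address alongside v1: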

12/31/2021

01:11 PM Bug #48468: ceph-osd crash before being up again
Hey Gonzalo,
It was some time ago, but from memory I created a huge swapfile (~50G) and restarted the os...
Clément Hampaï
04:15 AM Bug #53749 (New): ceph device scrape-health-metrics truncates smartctl/nvme output to 100 KiB
* https://github.com/ceph/ceph/blob/ae17c0a0c319c42d822e4618fd0d1c52c9b07ed1/src/common/blkdev.cc#L729
* https://git...
Niklas Hambuechen

12/30/2021

11:42 PM Bug #51626: OSD uses all host memory (80g) on startup due to pg_split
Gonzalo Aguilar Delgado wrote:
> Any update? I have the same trouble...
>
> I downgraded kernel to 4.XX because w...
Tor Martin Ølberg
01:17 PM Bug #51626: OSD uses all host memory (80g) on startup due to pg_split
Any update? I have the same trouble...
I downgraded kernel to 4.XX because with newer kernel I cannot even get thi...
Gonzalo Aguilar Delgado

12/29/2021

08:00 AM Bug #53740 (Resolved): mon: all mon daemon always crash after rm pool
We have an OpenStack cluster. Last week we started clearing all OpenStack instances and deleting all Ceph pools. All mo... Taizeng Wu

12/28/2021

04:25 AM Bug #50775: mds and osd unable to obtain rotating service keys
We ran into this issue, too. Our environment is a multi-host cluster (v15.2.9). Sometimes, we can observe that "unabl... Jerry Pu

12/27/2021

12:34 AM Bug #53729: ceph-osd takes all memory before oom on boot
Could you please set debug-osd to 5/20 and share relevant OSD startup log? Igor Fedotov
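A sketch of one way to raise the requested logging level; the OSD id is a placeholder, and persisting the setting via the config database is an assumption chosen so the higher level survives the restart being debugged:
  ceph config set osd.0 debug_osd 5/20   # osd.0 is a placeholder id
  systemctl restart ceph-osd@0           # reproduce the startup and collect its log
  ceph config rm osd.0 debug_osd         # drop the override afterwards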

12/25/2021

09:30 PM Bug #53729 (Resolved): ceph-osd takes all memory before oom on boot
Hi, I cannot boot half of my OSDs; all of them die, OOM-killed.
It seems they are taking all the memory. Everythi...
Gonzalo Aguilar Delgado
09:03 PM Bug #48468: ceph-osd crash before being up again
Clément Hampaï wrote:
> Hi Sage,
>
> Hum I've finally managed to recover my cluster after an uncounted osd resta...
Gonzalo Aguilar Delgado
09:03 PM Bug #48468: ceph-osd crash before being up again
Hi, I'm having the same problem.
-7> 2021-12-25T12:05:37.491+0100 7fd15c920640 1 heartbeat_map reset_timeout '...
Gonzalo Aguilar Delgado

12/23/2021

07:28 PM Bug #52925 (Closed): pg peering alway after trigger async recovery
As per https://github.com/ceph/ceph/pull/43534#issuecomment-984252587 Neha Ojha
07:27 PM Backport #53721 (Resolved): octopus: common: admin socket compiler warning
Backport Bot
07:27 PM Backport #53720 (Resolved): pacific: common: admin socket compiler warning
Backport Bot
07:25 PM Backport #53719 (Resolved): octopus: mon: frequent cpu_tp had timed out messages
https://github.com/ceph/ceph/pull/44546 Backport Bot
07:25 PM Backport #53718 (Resolved): pacific: mon: frequent cpu_tp had timed out messages
https://github.com/ceph/ceph/pull/44545 Backport Bot
07:24 PM Bug #43266 (Pending Backport): common: admin socket compiler warning
Neha Ojha
07:21 PM Bug #53506 (Pending Backport): mon: frequent cpu_tp had timed out messages
Neha Ojha
07:17 PM Bug #52124: Invalid read of size 8 in handle_recovery_delete()
/a/yuriw-2021-12-22_22:11:35-rados-wip-yuri3-testing-2021-12-22-1047-distro-default-smithi/6580187 Laura Flores
07:14 PM Bug #52124: Invalid read of size 8 in handle_recovery_delete()
/a/yuriw-2021-12-22_22:11:35-rados-wip-yuri3-testing-2021-12-22-1047-distro-default-smithi/6580436 Laura Flores
02:07 PM Bug #52509: PG merge: PG stuck in premerge+peered state
Neha Ojha wrote:
> Konstantin Shalygin wrote:
> > We can plan and spent time to setup staging cluster for this and ...
Konstantin Shalygin
10:17 AM Bug #52509: PG merge: PG stuck in premerge+peered state
@Markus just for the record, what is your Ceph version? And what is your hardware for OSDs? The actual issue was on ... Konstantin Shalygin
10:45 AM Bug #53327: osd: osd_fast_shutdown_notify_mon not quite right and enable osd_fast_shutdown_notify...
Hi Sage,
is there any update?
Manuel Lausch
09:32 AM Backport #53701 (In Progress): octopus: qa/tasks/backfill_toofull.py: AssertionError: 2.0 not in ...
PR: https://github.com/ceph/ceph/pull/43438 Mykola Golub
09:25 AM Backport #53702 (In Progress): pacific: qa/tasks/backfill_toofull.py: AssertionError: 2.0 not in ...
Mykola Golub
03:05 AM Bug #49689: osd/PeeringState.cc: ceph_abort_msg("past_interval start interval mismatch") start
Anonymous user wrote:
> Neha Ojha wrote:
> > [...]
> >
> > looks like this pg had the same pi when it got created
> >
>...
Shu Yu

12/22/2021

05:25 PM Backport #53702 (Resolved): pacific: qa/tasks/backfill_toofull.py: AssertionError: 2.0 not in bac...
https://github.com/ceph/ceph/pull/44387 Backport Bot
05:25 PM Backport #53701 (Resolved): octopus: qa/tasks/backfill_toofull.py: AssertionError: 2.0 not in bac...
https://github.com/ceph/ceph/pull/43438 Backport Bot
05:23 PM Bug #53677 (Pending Backport): qa/tasks/backfill_toofull.py: AssertionError: 2.0 not in backfilling
Neha Ojha
04:06 PM Bug #53677 (Fix Under Review): qa/tasks/backfill_toofull.py: AssertionError: 2.0 not in backfilling
Mykola Golub
03:44 PM Bug #47589: radosbench times out "reached maximum tries (800) after waiting for 4800 seconds"
/a/yuriw-2021-12-21_18:01:07-rados-wip-yuri3-testing-2021-12-21-0749-distro-default-smithi/6576331/ Kamoltat (Junior) Sirivadhna
11:53 AM Bug #53663: Random scrub errors (omap_digest_mismatch) on pgs of RADOSGW metadata pools
I asked on the ML about this issue - see thread here: https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/messag... Christian Rohmann
03:40 AM Bug #49689: osd/PeeringState.cc: ceph_abort_msg("past_interval start interval mismatch") start
Neha Ojha wrote:
> [...]
>
> looks like this pg had the same pi when it got created
>
> [...]
>
> but get_r...
Anonymous

12/21/2021

05:37 PM Bug #53677 (In Progress): qa/tasks/backfill_toofull.py: AssertionError: 2.0 not in backfilling
This looks like a false test failure due to the test setting backfillfull ratio too low.
We see that the test calc...
Mykola Golub
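For reference, the backfillfull ratio the test manipulates corresponds to cluster-wide thresholds that can be inspected and adjusted as below; the 0.90 value is illustrative and not taken from the test:
  ceph osd dump | grep full_ratio        # shows full_ratio, backfillfull_ratio, nearfull_ratio
  ceph osd set-backfillfull-ratio 0.90   # raise the threshold back to a sane value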
11:22 AM Bug #53685 (New): Assertion `HAVE_FEATURE(features, SERVER_OCTOPUS)' failed.
Test "rados/verify/{centos_latest ceph clusters/{fixed-2 openstack} d-thrash/default/{default thrashosds-health} mon_... Adam Kupczyk

12/20/2021

05:30 PM Bug #53677 (Resolved): qa/tasks/backfill_toofull.py: AssertionError: 2.0 not in backfilling
... Neha Ojha
11:51 AM Bug #23827: osd sends op_reply out of order
This bug occurred in my production environment (Nautilus 14.2.5) a few days ago, and my application exited because the client’s ... Ivan Guan
09:10 AM Bug #53667: osd cannot be started after being set to stop
fix in https://github.com/ceph/ceph/pull/44363 changzhi tan
08:55 AM Bug #53667 (Fix Under Review): osd cannot be started after being set to stop
After setting the OSD to stop, the OSD cannot be brought up again.
[root@controller-2 ~]# ceph osd status
ID HOST US...
changzhi tan

12/19/2021

07:56 PM Bug #44286: Cache tiering shows unfound objects after OSD reboots
The problem still exists on 15.2.15.
I've also got replicated size 3, min_size 2.
The problem occurs only when one O...
marek czardybon
12:50 AM Bug #53663: Random scrub errors (omap_digest_mismatch) on pgs of RADOSGW metadata pools
The only "special" settings I can think of are... Christian Rohmann

12/18/2021

11:16 PM Bug #53663 (Duplicate): Random scrub errors (omap_digest_mismatch) on pgs of RADOSGW metadata pools
On a 4 node Octopus cluster I am randomly seeing batches of scrub errors, as in:... Christian Rohmann

12/17/2021

04:28 PM Bug #53485: monstore: logm entries are not garbage collected
fix is in progress Daniel Poelzleithner
03:07 PM Backport #53660 (Resolved): octopus: mon: "FAILED ceph_assert(session_map.sessions.empty())" when...
https://github.com/ceph/ceph/pull/44544 Backport Bot
03:07 PM Backport #53659 (Resolved): pacific: mon: "FAILED ceph_assert(session_map.sessions.empty())" when...
https://github.com/ceph/ceph/pull/44543 Backport Bot
03:00 PM Bug #39150 (Pending Backport): mon: "FAILED ceph_assert(session_map.sessions.empty())" when out o...
Sage Weil

12/16/2021

11:24 PM Bug #53600 (Rejected): Crash in MOSDPGLog::encode_payload
Brad Hubbard
11:12 PM Bug #53600: Crash in MOSDPGLog::encode_payload
It should be noted there were a whole lot of oom-kill events on this node during the times these crashes occurred. Gi... Brad Hubbard
03:11 AM Bug #53600: Crash in MOSDPGLog::encode_payload
The binaries running when these crashes were seen actually are from this wip branch in the ceph-ci repo.
https://s...
Brad Hubbard
05:55 PM Bug #53485: monstore: logm entries are not garbage collected
I changed the paxos debug level to 20 and found this in the mon store log:... Daniel Poelzleithner
03:36 PM Bug #53485: monstore: logm entries are not garbage collected
We just grew to a whopping 80 GB metadata server. I'm out of ideas here and don't know how to stop the growth.
Somebody ad...
Daniel Poelzleithner
04:35 PM Backport #53644 (Resolved): pacific: Disable health warning when autoscaler is on
https://github.com/ceph/ceph/pull/45152 Backport Bot
04:33 PM Bug #53516 (Pending Backport): Disable health warning when autoscaler is on
Neha Ojha
03:56 PM Bug #52189: crash in AsyncConnection::maybe_start_delay_thread()
We observed a few more of those crashes. Six of them were just seconds or minutes apart, on different OSDs / hosts eve... Christian Rohmann
03:45 PM Bug #39150 (Fix Under Review): mon: "FAILED ceph_assert(session_map.sessions.empty())" when out o...
Sage Weil

12/15/2021

08:04 AM Bug #52488: Pacific mon won't join Octopus mons
The same problem occurs when migrating from Nautilus to Pacific. Michael Uleysky

12/14/2021

10:02 PM Bug #50042: rados/test.sh: api_watch_notify failures
... Neha Ojha
09:56 PM Bug #49524: ceph_test_rados_delete_pools_parallel didn't start
... Neha Ojha
12:31 PM Bug #50657 (Resolved): smart query on monitors
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Loïc Dachary
12:29 PM Bug #52583 (Resolved): partial recovery become whole object recovery after restart osd
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Loïc Dachary
12:23 PM Backport #52450 (Resolved): pacific: smart query on monitors
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/44164
m...
Loïc Dachary
12:22 PM Backport #52451 (Resolved): octopus: smart query on monitors
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/44177
m...
Loïc Dachary
12:20 PM Backport #51149 (Resolved): octopus: When read failed, ret can not take as data len, in FillInVer...
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/44174
m...
Loïc Dachary
12:20 PM Backport #51171 (Resolved): octopus: regression in ceph daemonperf command output, osd columns ar...
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/44176
m...
Loïc Dachary
12:20 PM Backport #52710 (Resolved): octopus: partial recovery become whole object recovery after restart osd
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/44165
m...
Loïc Dachary
12:20 PM Backport #53389 (Resolved): octopus: pg-temp entries are not cleared for PGs that no longer exist
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/44097
m...
Loïc Dachary
08:37 AM Bug #53600 (Rejected): Crash in MOSDPGLog::encode_payload
3 OSDs crashed on the gibba cluster. All the OSDs were part of the gibba045 node.
*Observations:*
- osd.15 and os...
Sridhar Seshasayee
01:22 AM Bug #53584: FAILED ceph_assert(pop.data.length() == sinfo.aligned_logical_offset_to_chunk_offset(...
Neha Ojha wrote:
> ..., it seems like you have "enough copies available" to remove the problematic OSD but we won't ...
玮文 胡

12/13/2021

10:56 PM Bug #52416 (Fix Under Review): devices: mon devices appear empty when scraping SMART metrics
Neha Ojha
10:48 PM Bug #53575 (Rejected): Valgrind reports memory "Leak_PossiblyLost" errors concerning lib64
We could suppress this but since it is not coming from the Ceph code, rejecting it. Neha Ojha
10:41 PM Bug #53584 (Need More Info): FAILED ceph_assert(pop.data.length() == sinfo.aligned_logical_offset...
Can you provide OSD logs for the PG that is crashing (from all the shards)? From the error logs, it seems like you ha... Neha Ojha
10:08 AM Bug #53593: RBD cloned image is slow in 4k write with "waiting for rw locks"
[Observed Poor Performance]
On an RBD image, we found that the 4K write IOPS is much lower than expected.
I understood th...
Cuicui Zhao
10:05 AM Bug #53593 (Pending Backport): RBD cloned image is slow in 4k write with "waiting for rw locks"
h1. [Observed Poor Performance]
On an RBD image, we found that the 4K write IOPS is much lower than expected.
I understoo...
Cuicui Zhao

12/12/2021

01:39 PM Bug #53586 (New): rocksdb: build error with rocksdb-6.25.x
Here we go again: the same bug as in #52415 affects all attempts to build ceph-16.2.7 against rocksdb-6.25-*
Cheers,
...
chris denice
08:49 AM Bug #53584 (Need More Info): FAILED ceph_assert(pop.data.length() == sinfo.aligned_logical_offset...
... 玮文 胡

12/11/2021

04:15 PM Backport #51149: octopus: When read failed, ret can not take as data len, in FillInVerifyExtent
Backport Bot wrote:
> https://github.com/ceph/ceph/pull/44174
merged
Yuri Weinstein

12/10/2021

11:46 PM Backport #51171: octopus: regression in ceph daemonperf command output, osd columns aren't visibl...
Backport Bot wrote:
> https://github.com/ceph/ceph/pull/44176
merged
Yuri Weinstein
11:43 PM Backport #52710: octopus: partial recovery become whole object recovery after restart osd
Backport Bot wrote:
> https://github.com/ceph/ceph/pull/44165
merged
Yuri Weinstein
11:43 PM Backport #53389: octopus: pg-temp entries are not cleared for PGs that no longer exist
Backport Bot wrote:
> https://github.com/ceph/ceph/pull/44097
merged
Yuri Weinstein
09:16 PM Bug #53516 (Fix Under Review): Disable health warning when autoscaler is on
Neha Ojha
06:03 PM Bug #52621: cephx: verify_authorizer could not decrypt ticket info: error: bad magic in decode_de...
... Neha Ojha

12/09/2021

11:06 PM Bug #52136: Valgrind reports memory "Leak_DefinitelyLost" errors.
/a/yuriw-2021-12-09_00:18:57-rados-wip-yuri-testing-2021-12-08-1336-distro-default-smithi/6553724/ ----> osd.1.log.gz Laura Flores
09:38 PM Bug #53575 (Resolved): Valgrind reports memory "Leak_PossiblyLost" errors concerning lib64
Found in /a/yuriw-2021-12-09_00:18:57-rados-wip-yuri-testing-2021-12-08-1336-distro-default-smithi/6553724
The fol...
Laura Flores
04:32 PM Backport #53549 (In Progress): nautilus: [RFE] Provide warning when the 'require-osd-release' fla...
Sridhar Seshasayee
01:43 PM Backport #53550 (In Progress): octopus: [RFE] Provide warning when the 'require-osd-release' flag...
Sridhar Seshasayee
12:53 PM Backport #53551 (In Progress): pacific: [RFE] Provide warning when the 'require-osd-release' flag...
Sridhar Seshasayee

12/08/2021

09:15 PM Backport #53551 (Resolved): pacific: [RFE] Provide warning when the 'require-osd-release' flag do...
https://github.com/ceph/ceph/pull/44259 Backport Bot
09:15 PM Backport #53550 (Resolved): octopus: [RFE] Provide warning when the 'require-osd-release' flag do...
https://github.com/ceph/ceph/pull/44260 Backport Bot
09:15 PM Backport #53549 (Rejected): nautilus: [RFE] Provide warning when the 'require-osd-release' flag d...
https://github.com/ceph/ceph/pull/44263 Backport Bot
09:13 PM Feature #51984 (Pending Backport): [RFE] Provide warning when the 'require-osd-release' flag does...
Neha Ojha
07:08 PM Bug #51904: test_pool_min_size:AssertionError:wait_for_clean:failed before timeout expired due to...
/a/yuriw-2021-12-07_16:04:59-rados-wip-yuri5-testing-2021-12-06-1619-distro-default-smithi/6551120
pg map right be...
Neha Ojha
06:49 PM Bug #53544 (New): src/test/osd/RadosModel.h: ceph_abort_msg("racing read got wrong version") in t...
... Neha Ojha
03:30 PM Bug #52124: Invalid read of size 8 in handle_recovery_delete()
/a/yuriw-2021-12-07_16:02:55-rados-wip-yuri11-testing-2021-12-06-1619-distro-default-smithi/6550873 Sridhar Seshasayee
12:15 PM Backport #53535 (Resolved): pacific: mon: mgrstatmonitor spams mgr with service_map
https://github.com/ceph/ceph/pull/44721 Backport Bot
12:15 PM Backport #53534 (Resolved): octopus: mon: mgrstatmonitor spams mgr with service_map
https://github.com/ceph/ceph/pull/44722 Backport Bot
12:10 PM Bug #53479 (Pending Backport): mon: mgrstatmonitor spams mgr with service_map
Sage Weil

12/07/2021

09:27 PM Bug #53516 (Resolved): Disable health warning when autoscaler is on
The command:
ceph health detail
displays a warning when a pool has many more objects per PG than other pools. Thi...
Christopher Hoffman

12/06/2021

10:05 PM Backport #53507 (Duplicate): pacific: ceph -s mon quorum age negative number
Backport Bot
10:03 PM Bug #53306 (Pending Backport): ceph -s mon quorum age negative number
Needs to be included in https://github.com/ceph/ceph/pull/43698 Neha Ojha
08:42 PM Backport #52450: pacific: smart query on monitors
Backport Bot wrote:
> https://github.com/ceph/ceph/pull/44164
merged
Yuri Weinstein
06:13 PM Bug #53506 (Fix Under Review): mon: frequent cpu_tp had timed out messages
Sage Weil
06:06 PM Bug #53506 (Closed): mon: frequent cpu_tp had timed out messages
... Sage Weil
11:06 AM Bug #52416: devices: mon devices appear empty when scraping SMART metrics
If `ceph-mon` runs as a systemd unit, check if `PrivateDevices=yes` in `/lib/systemd/system/ceph-mon@.service`; if so... Benoît Knecht
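One way to test the hypothesis above is a systemd drop-in that relaxes the sandboxing for the mon unit. This is a sketch of standard systemd mechanics only, not the fix adopted for #52416; the instance name after @ is the mon id, often the short hostname:
  # /etc/systemd/system/ceph-mon@.service.d/override.conf
  [Service]
  PrivateDevices=no
  # then reload units and restart the monitor:
  systemctl daemon-reload
  systemctl restart ceph-mon@$(hostname -s)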
10:30 AM Bug #53142: OSD crash in PG::do_delete_work when increasing PGs
Ist Gab wrote:
> Igor Fedotov wrote:
> > …
>
> Igor, do you think if we put a super fast 2-4TB write optimized n...
Igor Fedotov
09:14 AM Bug #52189: crash in AsyncConnection::maybe_start_delay_thread()
Neha Ojha wrote:
> We'll need more information to debug a crash like this.
@Neha, we observed another one of the...
Christian Rohmann
08:49 AM Bug #51307: LibRadosWatchNotify.Watch2Delete fails
/a/yuriw-2021-12-03_15:27:18-rados-wip-yuri11-testing-2021-12-02-1451-distro-default-smithi/6542889... Sridhar Seshasayee
08:25 AM Bug #53500: rte_eal_init fail will waiting forever
r
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: /...
chunsong feng
08:20 AM Bug #53500 (New): rte_eal_init fail will waiting forever
rte_eal_init returns a failure message and does not wake up the waiting msgr-worker thread. As a result, the wait... chunsong feng
 
