Activity
From 12/06/2021 to 01/04/2022
01/04/2022
- 11:31 PM Backport #53769 (Resolved): pacific: [ceph osd set noautoscale] Global on/off flag for PG autosca...
- 11:26 PM Feature #51213 (Pending Backport): [ceph osd set noautoscale] Global on/off flag for PG autoscale...
- 09:49 PM Bug #53768 (New): timed out waiting for admin_socket to appear after osd.2 restart in thrasher/de...
- Error snippet:
2022-01-02T01:37:09.296 DEBUG:teuthology.orchestra.run.smithi086:> sudo adjust-ulimits ceph-coverag...
- 09:35 PM Bug #53767 (Duplicate): qa/workunits/cls/test_cls_2pc_queue.sh: killing an osd during thrashing c...
- Description: rados/verify/{centos_latest ceph clusters/{fixed-2 openstack} d-thrash/default/{default thrashosds-healt...
- 06:14 PM Bug #51945: qa/workunits/mon/caps.sh: Error: Expected return 13, got 0
- /a/yuriw-2021-12-23_16:50:03-rados-wip-yuri6-testing-2021-12-22-1410-distro-default-smithi/6582413...
- 12:37 PM Bug #23827: osd sends op_reply out of order
- Here is the log information on my environment.
h3. op1 arrives and sends shards to peer OSDs
2021-12-01 18:27:...
- 03:11 AM Bug #53757: I have a rados object whose data size is 0, and this object has a large amount of oma...
- PR: https://github.com/ceph/ceph/pull/44450
- 02:59 AM Bug #53757 (Fix Under Review): I have a rados object whose data size is 0, and this object has a ...
- Env: Ceph version is 10.2.9, OS is RHEL 7.8, and kernel version is '3.13.0-86-generic'
1. Create some rados objects ...
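A minimal sketch of that reproduction with the rados CLI (pool and object names below are placeholders, not from the report):
rados -p testpool create obj-omap-0                        # zero-length RADOS object
for i in $(seq 1 100000); do
  rados -p testpool setomapval obj-omap-0 key_$i val_$i    # grow the omap while the data size stays 0
done
rados -p testpool listomapkeys obj-omap-0 | wc -l          # confirm the key count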
01/02/2022
- 06:58 PM Bug #53751: "N monitors have not enabled msgr2" is always shown for new clusters
- Another thing I don't understand from the docs:
https://docs.ceph.com/en/pacific/rados/configuration/msgr2/#transi...
- 06:34 PM Bug #53751 (Need More Info): "N monitors have not enabled msgr2" is always shown for new clusters
- I am experiencing that for new clusters (currently Ceph 16.2.7), `ceph status` always shows e.g.:
3 monitors h...
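For reference, the transition procedure in the msgr2 docs linked above boils down to enabling the v2 address on the monitors and then verifying it; a quick check (not necessarily a fix for this report) is:
ceph mon enable-msgr2     # ask each monitor to bind its v2 (port 3300) address
ceph mon dump             # each mon should now list a v2 addr in its addrvec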
12/31/2021
- 01:11 PM Bug #48468: ceph-osd crash before being up again
- Hey Gonzalo,
It was some time ago, but from memory I created a huge swapfile (~50G) and I restarted the os...
- 04:15 AM Bug #53749 (New): ceph device scrape-health-metrics truncates smartctl/nvme output to 100 KiB
- * https://github.com/ceph/ceph/blob/ae17c0a0c319c42d822e4618fd0d1c52c9b07ed1/src/common/blkdev.cc#L729
* https://git...
12/30/2021
- 11:42 PM Bug #51626: OSD uses all host memory (80g) on startup due to pg_split
- Gonzalo Aguilar Delgado wrote:
> Any update? I have the same trouble...
>
> I downgraded kernel to 4.XX because w...
- 01:17 PM Bug #51626: OSD uses all host memory (80g) on startup due to pg_split
- Any update? I have the same trouble...
I downgraded the kernel to 4.XX because with a newer kernel I cannot even get thi...
12/29/2021
- 08:00 AM Bug #53740 (Resolved): mon: all mon daemon always crash after rm pool
- We have an OpenStack cluster. Last week we started clearing all OpenStack instances and deleting all Ceph pools. All mo...
12/28/2021
- 04:25 AM Bug #50775: mds and osd unable to obtain rotating service keys
- We ran into this issue, too. Our environment is a multi-host cluster (v15.2.9). Sometimes, we can observe that "unabl...
12/27/2021
- 12:34 AM Bug #53729: ceph-osd takes all memory before oom on boot
- Could you please set debug-osd to 5/20 and share relevant OSD startup log?
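One way to do that (a sketch assuming a standard package-based deployment; osd.2 is just an example id) is:
ceph config set osd.2 debug_osd 5/20     # persist the higher log level for that OSD
systemctl restart ceph-osd@2             # restart so the startup path is captured
# startup logs then land in /var/log/ceph/ceph-osd.2.log (journalctl or cephadm logs for containerized setups)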
12/25/2021
- 09:30 PM Bug #53729 (Resolved): ceph-osd takes all memory before oom on boot
- Hi, I cannot boot half of my OSDs; all of them die, OOM-killed.
It seems they are taking all the memory. Everythi...
- 09:03 PM Bug #48468: ceph-osd crash before being up again
- Clément Hampaï wrote:
> Hi Sage,
>
> Hum I've finally managed to recover my cluster after an uncounted osd resta...
- 09:03 PM Bug #48468: ceph-osd crash before being up again
- Hi, I'm having the same problem.
-7> 2021-12-25T12:05:37.491+0100 7fd15c920640 1 heartbeat_map reset_timeout '...
12/23/2021
- 07:28 PM Bug #52925 (Closed): pg peering alway after trigger async recovery
- As per https://github.com/ceph/ceph/pull/43534#issuecomment-984252587
- 07:27 PM Backport #53721 (Resolved): octopus: common: admin socket compiler warning
- 07:27 PM Backport #53720 (Resolved): pacific: common: admin socket compiler warning
- 07:25 PM Backport #53719 (Resolved): octopus: mon: frequent cpu_tp had timed out messages
- https://github.com/ceph/ceph/pull/44546
- 07:25 PM Backport #53718 (Resolved): pacific: mon: frequent cpu_tp had timed out messages
- https://github.com/ceph/ceph/pull/44545
- 07:24 PM Bug #43266 (Pending Backport): common: admin socket compiler warning
- 07:21 PM Bug #53506 (Pending Backport): mon: frequent cpu_tp had timed out messages
- 07:17 PM Bug #52124: Invalid read of size 8 in handle_recovery_delete()
- /a/yuriw-2021-12-22_22:11:35-rados-wip-yuri3-testing-2021-12-22-1047-distro-default-smithi/6580187
- 07:14 PM Bug #52124: Invalid read of size 8 in handle_recovery_delete()
- /a/yuriw-2021-12-22_22:11:35-rados-wip-yuri3-testing-2021-12-22-1047-distro-default-smithi/6580436
- 02:07 PM Bug #52509: PG merge: PG stuck in premerge+peered state
- Neha Ojha wrote:
> Konstantin Shalygin wrote:
> > We can plan and spend time to set up a staging cluster for this and ...
- 10:17 AM Bug #52509: PG merge: PG stuck in premerge+peered state
- @Markus just for the record, what is your Ceph version? And what is your hardware for the OSDs? The actual issue was on ...
- 10:45 AM Bug #53327: osd: osd_fast_shutdown_notify_mon not quite right and enable osd_fast_shutdown_notify...
- Hi Sage,
is there any update?
- 09:32 AM Backport #53701 (In Progress): octopus: qa/tasks/backfill_toofull.py: AssertionError: 2.0 not in ...
- PR: https://github.com/ceph/ceph/pull/43438
- 09:25 AM Backport #53702 (In Progress): pacific: qa/tasks/backfill_toofull.py: AssertionError: 2.0 not in ...
- 03:05 AM Bug #49689: osd/PeeringState.cc: ceph_abort_msg("past_interval start interval mismatch") start
- Anonymous user wrote:
> Neha Ojha wrote:
> > [...]
> >
> > looks like this pg had the same pi when it got created
> >
>...
12/22/2021
- 05:25 PM Backport #53702 (Resolved): pacific: qa/tasks/backfill_toofull.py: AssertionError: 2.0 not in bac...
- https://github.com/ceph/ceph/pull/44387
- 05:25 PM Backport #53701 (Resolved): octopus: qa/tasks/backfill_toofull.py: AssertionError: 2.0 not in bac...
- https://github.com/ceph/ceph/pull/43438
- 05:23 PM Bug #53677 (Pending Backport): qa/tasks/backfill_toofull.py: AssertionError: 2.0 not in backfilling
- 04:06 PM Bug #53677 (Fix Under Review): qa/tasks/backfill_toofull.py: AssertionError: 2.0 not in backfilling
- 03:44 PM Bug #47589: radosbench times out "reached maximum tries (800) after waiting for 4800 seconds"
- /a/yuriw-2021-12-21_18:01:07-rados-wip-yuri3-testing-2021-12-21-0749-distro-default-smithi/6576331/
- 11:53 AM Bug #53663: Random scrub errors (omap_digest_mismatch) on pgs of RADOSGW metadata pools
- I asked on the ML about this issue - see thread here: https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/messag...
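Typical commands for digging into such scrub errors (the pool and PG id below are placeholders) are:
rados list-inconsistent-pg default.rgw.meta              # list PGs with recorded inconsistencies in a pool
rados list-inconsistent-obj 7.12 --format=json-pretty    # per-object detail, including omap_digest per shard
ceph pg repair 7.12                                      # repair once the bad shard is understood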
- 03:40 AM Bug #49689: osd/PeeringState.cc: ceph_abort_msg("past_interval start interval mismatch") start
- Neha Ojha wrote:
> [...]
>
> looks like this pg had the same pi when it got created
>
> [...]
>
> but get_r...
12/21/2021
- 05:37 PM Bug #53677 (In Progress): qa/tasks/backfill_toofull.py: AssertionError: 2.0 not in backfilling
- This looks like a false test failure due to the test setting backfillfull ratio too low.
We see that the test calc...
- 11:22 AM Bug #53685 (New): Assertion `HAVE_FEATURE(features, SERVER_OCTOPUS)' failed.
- Test "rados/verify/{centos_latest ceph clusters/{fixed-2 openstack} d-thrash/default/{default thrashosds-health} mon_...
12/20/2021
- 05:30 PM Bug #53677 (Resolved): qa/tasks/backfill_toofull.py: AssertionError: 2.0 not in backfilling
- ...
- 11:51 AM Bug #23827: osd sends op_reply out of order
- This bug occurred in my online environment (Nautilus 14.2.5) a few days ago and my application exited because client’s ...
- 09:10 AM Bug #53667: osd cannot be started after being set to stop
- Fix in https://github.com/ceph/ceph/pull/44363
- 08:55 AM Bug #53667 (Fix Under Review): osd cannot be started after being set to stop
- After setting an OSD to stop, the OSD cannot be brought up again:
[root@controller-2 ~]# ceph osd status
ID HOST US...
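A sketch of the sequence being described (osd.1 is a placeholder id; the restart path assumes a non-containerized systemd deployment):
ceph osd stop osd.1              # mark the OSD stopped; the daemon shuts down
systemctl restart ceph-osd@1     # try to bring the daemon back up
ceph osd status                  # per the report, the OSD does not return to "up" after this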
12/19/2021
- 07:56 PM Bug #44286: Cache tiering shows unfound objects after OSD reboots
- The problem still exists on 15.2.15.
I've also got replicated size 3, min_size 2.
The problem occurs only when one O...
- 12:50 AM Bug #53663: Random scrub errors (omap_digest_mismatch) on pgs of RADOSGW metadata pools
- The only "special" settings I can think of are...
12/18/2021
- 11:16 PM Bug #53663 (Duplicate): Random scrub errors (omap_digest_mismatch) on pgs of RADOSGW metadata pools
- On a 4-node Octopus cluster I am randomly seeing batches of scrub errors, as in:...
12/17/2021
- 04:28 PM Bug #53485: monstore: logm entries are not garbage collected
- A fix is in progress.
- 03:07 PM Backport #53660 (Resolved): octopus: mon: "FAILED ceph_assert(session_map.sessions.empty())" when...
- https://github.com/ceph/ceph/pull/44544
- 03:07 PM Backport #53659 (Resolved): pacific: mon: "FAILED ceph_assert(session_map.sessions.empty())" when...
- https://github.com/ceph/ceph/pull/44543
- 03:00 PM Bug #39150 (Pending Backport): mon: "FAILED ceph_assert(session_map.sessions.empty())" when out o...
12/16/2021
- 11:24 PM Bug #53600 (Rejected): Crash in MOSDPGLog::encode_payload
- 11:12 PM Bug #53600: Crash in MOSDPGLog::encode_payload
- It should be noted there were a whole lot of oom-kill events on this node during the times these crashes occurred. Gi...
- 03:11 AM Bug #53600: Crash in MOSDPGLog::encode_payload
- The binaries running when these crashes were seen are actually from this wip branch in the ceph-ci repo.
https://s...
- 05:55 PM Bug #53485: monstore: logm entries are not garbage collected
- I changed the paxos debug level to 20 and found this in the mon store log:...
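For anyone reproducing this, one way to raise that debug level (mon.a below is a placeholder id):
ceph config set mon debug_paxos 20/20            # raise paxos logging on all monitors
ceph daemon mon.a config set debug_paxos 20/20   # or on a single monitor via its admin socket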
- 03:36 PM Bug #53485: monstore: logm entries are not garbage collected
- We just grew to a whopping 80 GB metadata server. I'm out of ideas here and don't know how to stop the growth.
Somebody ad...
- 04:35 PM Backport #53644 (Resolved): pacific: Disable health warning when autoscaler is on
- https://github.com/ceph/ceph/pull/45152
- 04:33 PM Bug #53516 (Pending Backport): Disable health warning when autoscaler is on
- 03:56 PM Bug #52189: crash in AsyncConnection::maybe_start_delay_thread()
- We observed a few more of those crashes. Six of them were just seconds or minutes apart on different OSDs / hosts eve...
- 03:45 PM Bug #39150 (Fix Under Review): mon: "FAILED ceph_assert(session_map.sessions.empty())" when out o...
12/15/2021
- 08:04 AM Bug #52488: Pacific mon won't join Octopus mons
- There is the same problem with migrating to Pacific from Nautilus
12/14/2021
- 10:02 PM Bug #50042: rados/test.sh: api_watch_notify failures
- ...
- 09:56 PM Bug #49524: ceph_test_rados_delete_pools_parallel didn't start
- ...
- 12:31 PM Bug #50657 (Resolved): smart query on monitors
- While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ...
- 12:29 PM Bug #52583 (Resolved): partial recovery become whole object recovery after restart osd
- While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ...
- 12:23 PM Backport #52450 (Resolved): pacific: smart query on monitors
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/44164
m...
- 12:22 PM Backport #52451 (Resolved): octopus: smart query on monitors
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/44177
m...
- 12:20 PM Backport #51149 (Resolved): octopus: When read failed, ret can not take as data len, in FillInVer...
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/44174
m...
- 12:20 PM Backport #51171 (Resolved): octopus: regression in ceph daemonperf command output, osd columns ar...
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/44176
m...
- 12:20 PM Backport #52710 (Resolved): octopus: partial recovery become whole object recovery after restart osd
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/44165
m...
- 12:20 PM Backport #53389 (Resolved): octopus: pg-temp entries are not cleared for PGs that no longer exist
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/44097
m...
- 08:37 AM Bug #53600 (Rejected): Crash in MOSDPGLog::encode_payload
- 3 OSDs crashed on the gibba cluster. All the OSDs were part of the gibba045 node.
*Observations:*
- osd.15 and os...
- 01:22 AM Bug #53584: FAILED ceph_assert(pop.data.length() == sinfo.aligned_logical_offset_to_chunk_offset(...
- Neha Ojha wrote:
> ..., it seems like you have "enough copies available" to remove the problematic OSD but we won't ...
12/13/2021
- 10:56 PM Bug #52416 (Fix Under Review): devices: mon devices appear empty when scraping SMART metrics
- 10:48 PM Bug #53575 (Rejected): Valgrind reports memory "Leak_PossiblyLost" errors concerning lib64
- We could suppress this but since it is not coming from the Ceph code, rejecting it.
- 10:41 PM Bug #53584 (Need More Info): FAILED ceph_assert(pop.data.length() == sinfo.aligned_logical_offset...
- Can you provide OSD logs for the PG that is crashing (from all the shards)? From the error logs, it seems like you ha...
- 10:08 AM Bug #53593: RBD cloned image is slow in 4k write with "waiting for rw locks"
- [Observed Poor Performance]
On an RBD image, we found the 4k write IOPS are much lower than expected.
I understood th...
- 10:05 AM Bug #53593 (Pending Backport): RBD cloned image is slow in 4k write with "waiting for rw locks"
- h1. [Observed Poor Performance]
On an RBD image, we found the 4k write IOPS are much lower than expected.
I understoo...
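A measurement sketch for this kind of comparison with fio's rbd engine (pool, image, and client names are placeholders, not taken from the report):
fio --name=rbd-4k-randwrite --ioengine=rbd --clientname=admin \
    --pool=rbd --rbdname=test-clone --rw=randwrite --bs=4k \
    --iodepth=32 --numjobs=1 --runtime=60 --time_based
Running the same job against the parent image and the clone should show the gap described above.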
12/12/2021
- 01:39 PM Bug #53586 (New): rocksdb: build error with rocksdb-6.25.x
- Here we go again: same bug as in #52415; it affects all attempts to build ceph-16.2.7 against rocksdb-6.25-*
Cheers,
...
- 08:49 AM Bug #53584 (Need More Info): FAILED ceph_assert(pop.data.length() == sinfo.aligned_logical_offset...
- ...
12/11/2021
- 04:15 PM Backport #51149: octopus: When read failed, ret can not take as data len, in FillInVerifyExtent
- Backport Bot wrote:
> https://github.com/ceph/ceph/pull/44174
merged
12/10/2021
- 11:46 PM Backport #51171: octopus: regression in ceph daemonperf command output, osd columns aren't visibl...
- Backport Bot wrote:
> https://github.com/ceph/ceph/pull/44176
merged
- 11:43 PM Backport #52710: octopus: partial recovery become whole object recovery after restart osd
- Backport Bot wrote:
> https://github.com/ceph/ceph/pull/44165
merged
- 11:43 PM Backport #53389: octopus: pg-temp entries are not cleared for PGs that no longer exist
- Backport Bot wrote:
> https://github.com/ceph/ceph/pull/44097
merged
- 09:16 PM Bug #53516 (Fix Under Review): Disable health warning when autoscaler is on
- 06:03 PM Bug #52621: cephx: verify_authorizer could not decrypt ticket info: error: bad magic in decode_de...
- ...
12/09/2021
- 11:06 PM Bug #52136: Valgrind reports memory "Leak_DefinitelyLost" errors.
- /a/yuriw-2021-12-09_00:18:57-rados-wip-yuri-testing-2021-12-08-1336-distro-default-smithi/6553724/ ----> osd.1.log.gz
- 09:38 PM Bug #53575 (Resolved): Valgrind reports memory "Leak_PossiblyLost" errors concerning lib64
- Found in /a/yuriw-2021-12-09_00:18:57-rados-wip-yuri-testing-2021-12-08-1336-distro-default-smithi/6553724
The fol...
- 04:32 PM Backport #53549 (In Progress): nautilus: [RFE] Provide warning when the 'require-osd-release' fla...
- 01:43 PM Backport #53550 (In Progress): octopus: [RFE] Provide warning when the 'require-osd-release' flag...
- 12:53 PM Backport #53551 (In Progress): pacific: [RFE] Provide warning when the 'require-osd-release' flag...
12/08/2021
- 09:15 PM Backport #53551 (Resolved): pacific: [RFE] Provide warning when the 'require-osd-release' flag do...
- https://github.com/ceph/ceph/pull/44259
- 09:15 PM Backport #53550 (Resolved): octopus: [RFE] Provide warning when the 'require-osd-release' flag do...
- https://github.com/ceph/ceph/pull/44260
- 09:15 PM Backport #53549 (Rejected): nautilus: [RFE] Provide warning when the 'require-osd-release' flag d...
- https://github.com/ceph/ceph/pull/44263
- 09:13 PM Feature #51984 (Pending Backport): [RFE] Provide warning when the 'require-osd-release' flag does...
- 07:08 PM Bug #51904: test_pool_min_size:AssertionError:wait_for_clean:failed before timeout expired due to...
- /a/yuriw-2021-12-07_16:04:59-rados-wip-yuri5-testing-2021-12-06-1619-distro-default-smithi/6551120
pg map right be...
- 06:49 PM Bug #53544 (New): src/test/osd/RadosModel.h: ceph_abort_msg("racing read got wrong version") in t...
- ...
- 03:30 PM Bug #52124: Invalid read of size 8 in handle_recovery_delete()
- /a/yuriw-2021-12-07_16:02:55-rados-wip-yuri11-testing-2021-12-06-1619-distro-default-smithi/6550873
- 12:15 PM Backport #53535 (Resolved): pacific: mon: mgrstatmonitor spams mgr with service_map
- https://github.com/ceph/ceph/pull/44721
- 12:15 PM Backport #53534 (Resolved): octopus: mon: mgrstatmonitor spams mgr with service_map
- https://github.com/ceph/ceph/pull/44722
- 12:10 PM Bug #53479 (Pending Backport): mon: mgrstatmonitor spams mgr with service_map
12/07/2021
- 09:27 PM Bug #53516 (Resolved): Disable health warning when autoscaler is on
- The command:
ceph health detail
displays a warning when a pool has many more objects per PG than other pools. Thi...
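If the warning in question is MANY_OBJECTS_PER_PG (an assumption on my part; the description above is truncated), the relevant knobs look like this:
ceph config set mgr mon_pg_warn_max_object_skew 0   # 0 disables the objects-per-PG skew warning
ceph health mute MANY_OBJECTS_PER_PG 1w             # or mute the health code temporarily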
12/06/2021
- 10:05 PM Backport #53507 (Duplicate): pacific: ceph -s mon quorum age negative number
- 10:03 PM Bug #53306 (Pending Backport): ceph -s mon quorum age negative number
- Needs to be included in https://github.com/ceph/ceph/pull/43698
- 08:42 PM Backport #52450: pacific: smart query on monitors
- Backport Bot wrote:
> https://github.com/ceph/ceph/pull/44164
merged
- 06:13 PM Bug #53506 (Fix Under Review): mon: frequent cpu_tp had timed out messages
- 06:06 PM Bug #53506 (Closed): mon: frequent cpu_tp had timed out messages
- ...
- 11:06 AM Bug #52416: devices: mon devices appear empty when scraping SMART metrics
- If `ceph-mon` runs as a systemd unit, check if `PrivateDevices=yes` in `/lib/systemd/system/ceph-mon@.service`; if so...
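A workaround sketch using standard systemd mechanics (not an official fix; the mon id in the unit name is host-specific):
# /etc/systemd/system/ceph-mon@.service.d/override.conf
[Service]
PrivateDevices=no
# then:
systemctl daemon-reload
systemctl restart ceph-mon@$(hostname -s)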
- 10:30 AM Bug #53142: OSD crash in PG::do_delete_work when increasing PGs
- Ist Gab wrote:
> Igor Fedotov wrote:
> > …
>
> Igor, do you think if we put a super fast 2-4TB write optimized n... - 09:14 AM Bug #52189: crash in AsyncConnection::maybe_start_delay_thread()
- Neha Ojha wrote:
> We'll need more information to debug a crash like this.
@Nea, we observed another one of the... - 08:49 AM Bug #51307: LibRadosWatchNotify.Watch2Delete fails
- /a/yuriw-2021-12-03_15:27:18-rados-wip-yuri11-testing-2021-12-02-1451-distro-default-smithi/6542889...
- 08:25 AM Bug #53500: rte_eal_init fail will waiting forever
- r
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: /...
- 08:20 AM Bug #53500 (New): rte_eal_init fail will waiting forever
- The rte_eal_init returns a failure message and does not wake up the waiting msgr-worker thread. As a result, the wait...