Activity

From 09/29/2021 to 10/28/2021

10/28/2021

11:29 PM Feature #51213: [ceph osd set noautoscale] Global on/off flag for PG autoscale feature
PR: https://github.com/ceph/ceph/pull/43716 Kamoltat (Junior) Sirivadhna
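For reference, a minimal sketch of the flag interface the PR proposes, assuming it follows the usual `ceph osd set/unset` pattern named in the title:
  ceph osd set noautoscale     # freeze PG autoscaling cluster-wide
  ceph osd unset noautoscale   # resume autoscaling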
03:29 PM Backport #52845 (In Progress): pacific: osd: add scrub duration to pg dump
Cory Snyder
02:12 PM Bug #51942: src/osd/scrub_machine.cc: FAILED ceph_assert(state_cast<const NotActive*>())
and /a/sage-2021-10-28_02:19:01-rados-wip-sage3-testing-2021-10-27-1300-distro-basic-smithi/6464056
with logs
Sage Weil
02:10 PM Bug #51942: src/osd/scrub_machine.cc: FAILED ceph_assert(state_cast<const NotActive*>())
/a/sage-2021-10-28_02:19:01-rados-wip-sage3-testing-2021-10-27-1300-distro-basic-smithi/6464393
with osd logs
Sage Weil
02:08 PM Bug #39150: mon: "FAILED ceph_assert(session_map.sessions.empty())" when out of quorum
/a/sage-2021-10-28_02:19:01-rados-wip-sage3-testing-2021-10-27-1300-distro-basic-smithi/6464204
with logs!
Sage Weil
02:06 PM Bug #24990 (Fix Under Review): api_watch_notify: LibRadosWatchNotify.Watch3Timeout failed
Sage Weil
02:04 PM Bug #24990: api_watch_notify: LibRadosWatchNotify.Watch3Timeout failed
/a/sage-2021-10-28_02:19:01-rados-wip-sage3-testing-2021-10-27-1300-distro-basic-smithi/6464087... Sage Weil
11:56 AM Feature #52424 (In Progress): [RFE] Limit slow request details to mgr log
Prashant D

10/27/2021

07:55 PM Feature #53050: Support blocklisting a CIDR range
Greg Farnum wrote:
> Patrick Donnelly wrote:
> > So we're going to put a huge asterisk here that the CIDR range of ...
Patrick Donnelly
05:00 AM Feature #53050: Support blocklisting a CIDR range
Patrick Donnelly wrote:
> So we're going to put a huge asterisk here that the CIDR range of machines must be hard-re...
Greg Farnum
01:39 AM Feature #53050: Support blocklisting a CIDR range
So we're going to put a huge asterisk here that the CIDR range of machines must be hard-rebooted, right? Otherwise w... Patrick Donnelly
06:46 PM Bug #53067: Fix client "version" display for kernel clients
I looked at this at one point and it was moderately irritating, but the display bug is also really confusing for user... Greg Farnum
01:10 PM Bug #53067 (New): Fix client "version" display for kernel clients
Hello
When a rhel7 client mounts a cephfs share, it appears in `ceph features` as if it were a jewel client, even if t...
gustavo panizzo
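For context, `ceph features` groups connected clients by the release their feature bits decode to; a sketch of the symptom described, with purely illustrative values:
  $ ceph features
  ...
  "client": [
      {
          "features": "0x40107b86a842ada",   # illustrative bitmask, not from the report
          "release": "jewel",                # modern kernel client misreported as jewel
          "num": 1
      }
  ]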
10:40 AM Bug #52509: PG merge: PG stuck in premerge+peered state
We had a similar outage.
We did try to increase the number of PGs on a bucket-index-pool:...
Markus Wennrich

10/26/2021

08:30 PM Backport #52770: pacific: pg scrub stat mismatch with special objects that have hash 'ffffffff'
Backport Bot wrote:
> https://github.com/ceph/ceph/pull/43512
merged
Yuri Weinstein
08:29 PM Backport #52620: pacific: partial recovery become whole object recovery after restart osd
Backport Bot wrote:
> https://github.com/ceph/ceph/pull/43513
merged
Yuri Weinstein
08:28 PM Backport #52843: pacific: msg/async/ProtocolV2: recv_stamp of a message is set to a wrong value
Backport Bot wrote:
> https://github.com/ceph/ceph/pull/43511
merged
Yuri Weinstein
08:26 PM Backport #52831: pacific: osd: pg may get stuck in backfill_toofull after backfill is interrupted...
Backport Bot wrote:
> https://github.com/ceph/ceph/pull/43437
merged
Yuri Weinstein
07:45 PM Bug #52124: Invalid read of size 8 in handle_recovery_delete()
/a/yuriw-2021-10-21_13:40:38-rados-wip-yuri2-testing-2021-10-20-1700-pacific-distro-basic-smithi/6454961/remote/smith... Neha Ojha
06:48 PM Feature #48590 (Rejected): Add ability to blocklist a cephx entity name, a set of entities by a l...
This is really impractical to do in RADOS. Closing in favor of https://tracker.ceph.com/issues/53050 Greg Farnum
06:48 PM Feature #53050 (Resolved): Support blocklisting a CIDR range
Disaster recovery use cases want to be able to fence off entire IP ranges, rather than needing to specify individual ... Greg Farnum
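A hedged sketch of what such an interface could look like, assuming a range mode is added to the existing `ceph osd blocklist` command (the final syntax is up to the eventual PR):
  ceph osd blocklist range add 192.168.1.0/24   # hypothetical: fence a whole subnet
  ceph osd blocklist range rm 192.168.1.0/24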
04:55 PM Bug #52761: OSDs announcing incorrect front_addr after upgrade to 16.2.6
Yes, I tried that, but it does not change the behavior:
>> ceph config set global public_network 10.113.0.0/16
...
Javier Cacheiro
04:35 PM Bug #52948: osd: fails to come up: "teuthology.misc:7 of 8 OSDs are up"
Neha Ojha wrote:
> This could be related to Gabi's work on removing allocation metadata from rocksdb, I will h...
Patrick Donnelly

10/25/2021

09:34 PM Bug #52761: OSDs announcing incorrect front_addr after upgrade to 16.2.6
The docs suggest setting public_network in the global section, not just for the mons https://docs.ceph.com/en/latest/... Neha Ojha
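For reference, the equivalent ceph.conf form of that suggestion, using the subnet from this report:
  [global]
  public_network = 10.113.0.0/16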
09:27 PM Bug #52760 (Need More Info): Monitor unable to rejoin the cluster
Can you share mon logs from all the monitors with debug_mon=20 and debug_ms=1? Neha Ojha
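A minimal sketch of raising those debug levels, via the config database or injected at runtime:
  ceph config set mon debug_mon 20
  ceph config set mon debug_ms 1
  # or without a restart:
  ceph tell mon.* injectargs '--debug_mon 20 --debug_ms 1'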
09:26 PM Bug #52513: BlueStore.cc: 12391: ceph_abort_msg("unexpected error") on operation 15
Added logs to teuthology:/post/tracker_52513/ Josh Durgin
09:20 PM Bug #52513: BlueStore.cc: 12391: ceph_abort_msg("unexpected error") on operation 15
Konstantin Shalygin wrote:
> I reproduced this issue:
>
> # put rados object to pool
> # get mapping for thi...
Neha Ojha
09:20 PM Bug #52513 (New): BlueStore.cc: 12391: ceph_abort_msg("unexpected error") on operation 15
Neha Ojha
09:15 PM Bug #15546 (Resolved): json numerical output is between quotes
Based on https://tracker.ceph.com/issues/15546#note-2 (Thanks Laura!) Neha Ojha
09:13 PM Bug #52385 (Need More Info): a possible data loss due to recovery_unfound PG after restarting all...
Neha Ojha
09:13 PM Bug #27053: qa: thrashosds: "[ERR] : 2.0 has 1 objects unfound and apparently lost"
Deepika Upadhyay wrote:
> [...]
>
> /ceph/teuthology-archive/yuriw-2021-10-18_19:03:43-rados-wip-yuri5-testing-20...
Neha Ojha
09:06 PM Bug #44184: Slow / Hanging Ops after pool creation
Neha Ojha wrote:
> Which version are you using?
Octopus 15.2.14
Ist Gab
08:50 PM Bug #44184: Slow / Hanging Ops after pool creation
Ist Gab wrote:
> Wido den Hollander wrote:
> > On a cluster with 1405 OSDs I've run into a situation for the second...
Neha Ojha
08:59 PM Bug #52948: osd: fails to come up: "teuthology.misc:7 of 8 OSDs are up"
This could be related to Gabi's work on removing allocation metadata from rocksdb; I will have him verify.
...
Neha Ojha
06:28 AM Bug #53000: OSDMap/OSDMapTest.BUG_51842/2: ThreadPool::WorkQueue<ParallelPGMapper::Item>::_void_d...
Yet another example, now for a PR for the master branch [1, 2], and for OSDMap/OSDMapTest.BUG_51842/1:... Mykola Golub

10/21/2021

06:30 AM Bug #53000 (New): OSDMap/OSDMapTest.BUG_51842/2: ThreadPool::WorkQueue<ParallelPGMapper::Item>::_...
This failure was reported by Jenkins for a pacific branch PR [1], though it does not look related to that PR, and ... Mykola Golub
06:07 AM Bug #38219 (Resolved): rebuild-mondb hangs
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Loïc Dachary
05:53 AM Backport #52809 (Resolved): octopus: ceph-erasure-code-tool: new tool to encode/decode files
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/43407
m...
Loïc Dachary
05:52 AM Backport #51552 (Resolved): octopus: rebuild-mondb hangs
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/43263
m...
Loïc Dachary
05:49 AM Backport #51569 (Resolved): octopus: pool last_epoch_clean floor is stuck after pg merging
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/42837
m...
Loïc Dachary

10/20/2021

07:39 PM Bug #52993 (New): upgrade:octopus-x Test: Upgrade test failed due to timeout of the "ceph pg dump...
/a/teuthology-2021-10-12_13:10:21-upgrade:octopus-x-pacific-distro-basic-smithi/6433896
/a/teuthology-2021-10-12_13:...
Sridhar Seshasayee
07:24 PM Feature #52992: Enhance auto-repair capabilities to handle stat mismatch scrub errors
Downstream bug - https://bugzilla.redhat.com/show_bug.cgi?id=2010447 Vikhyat Umrao
07:24 PM Feature #52992 (New): Enhance auto-repair capabilities to handle stat mismatch scrub errors
At present, auto-repair does not handle stat mismatch scrub errors.
- We plan to enhance the auto-repair capabi...
Vikhyat Umrao
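For reference, the existing auto-repair knobs that this RFE would extend (these currently do not cover stat mismatches):
  ceph config set osd osd_scrub_auto_repair true           # let scrub repair what it can
  ceph config set osd osd_scrub_auto_repair_num_errors 5   # but only below this error count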
06:30 PM Bug #52925 (Fix Under Review): pg peering always occurs after triggering async recovery
Neha Ojha
02:58 AM Bug #47838: mon/test_mon_osdmap_prune.sh: first_pinned != trim_to
... Deepika Upadhyay
02:55 AM Bug #27053: qa: thrashosds: "[ERR] : 2.0 has 1 objects unfound and apparently lost"
... Deepika Upadhyay

10/19/2021

08:55 PM Bug #52886: osd: backfill reservation does not take compression into account
Neha Ojha wrote:
> I'll create a trello card to track this, I think the initial toofull implementation was intention...
Neha Ojha
02:12 AM Bug #52969: the "ceph df" command shows pool max avail increasing when there are degraded objects i...
My solution is to add a function del_down_out_osd() to PGMap::get_rule_avail() to calculate the avail value of the st... minghang zhao
01:05 AM Bug #52969 (Fix Under Review): the "ceph df" command shows pool max avail increasing when there are...
Before the osds went down:
--- POOLS ---
POOL ID STORED OBJECTS USED %USED MAX AVAIL
device_health_met...
minghang zhao
01:40 AM Bug #52385: a possible data loss due to recovery_unfound PG after restarting all nodes
> Please provide osd logs (roughly 10 mins) from all replicas with debug_osd=20, debug_ms=1, when the osds are restar... Satoru Takeuchi

10/18/2021

10:24 PM Bug #51576: qa/tasks/radosbench.py times out
... Neha Ojha
05:54 PM Bug #52967 (New): premerge pgs may be backfill_wait for a long time
... Sage Weil

10/17/2021

11:27 AM Bug #44184: Slow / Hanging Ops after pool creation
Ist Gab wrote:
> > Are you sure recovery_deletes is set in the OSDMap?
>
> yeah, these are the flags:
> flags no...
Wido den Hollander

10/15/2021

04:59 PM Bug #15546: json numerical output is between quotes
This seems to be okay now. Is it possible that this issue was fixed without getting updated?... Laura Flores
02:58 PM Bug #52948 (New): osd: fails to come up: "teuthology.misc:7 of 8 OSDs are up"
... Patrick Donnelly
02:06 PM Bug #52513: BlueStore.cc: 12391: ceph_abort_msg("unexpected error") on operation 15
I reproduced this issue:
# put rados object to pool
# get mapping for this object...
Konstantin Shalygin
12:47 PM Bug #44184: Slow / Hanging Ops after pool creation
> Are you sure recovery_deletes is set in the OSDMap?
yeah, these are the flags:
flags noout,sortbitwise,recovery...
Ist Gab
11:58 AM Bug #44184: Slow / Hanging Ops after pool creation
Ist Gab wrote:
> Wido den Hollander wrote:
> > On a cluster with 1405 OSDs I've run into a situation for the second...
Wido den Hollander
12:20 PM Support #52881: Filtered out host node3.foo.com: does not belong to mon public_network ()
I can now answer my own question: it was a misconfiguration.
After I entered the unmanaged mode with
@$ ...
Ralph Soika

10/14/2021

07:14 PM Bug #52867: pick_address.cc prints: unable to find any IPv4 address in networks 'fd00:fd00:fd00:3...
As per comment #3 I was on the right path but I should have set an OSD setting, not a mon setting. If I run the follo... John Fulton
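A sketch of the standard msgr bind flags for an IPv6-only cluster, assuming that is the direction of the fix referenced above:
  ceph config set global ms_bind_ipv6 true    # bind daemons to IPv6
  ceph config set global ms_bind_ipv4 false   # and stop looking for IPv4 addresses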
06:15 AM Bug #52867: pick_address.cc prints: unable to find any IPv4 address in networks 'fd00:fd00:fd00:3...
@John,
per the logging message pasted at http://ix.io/3B1y...
Kefu Chai
04:53 PM Bug #51527 (Pending Backport): Ceph osd crashed due to segfault
Just sent https://github.com/ceph/ceph/pull/43548 for pacific. Nothing more to do here, I think. Radoslaw Zarzynski
04:32 PM Bug #51527: Ceph osd crashed due to segfault
This is a known issue that has been fixed in master by commit https://github.com/ceph/ceph/commit/d51d80b3234e1769006... Radoslaw Zarzynski
03:56 PM Backport #52938 (In Progress): nautilus: Primary OSD crash caused corrupted object and further cr...
Mykola Golub
02:46 PM Backport #52938 (Rejected): nautilus: Primary OSD crash caused corrupted object and further crash...
https://github.com/ceph/ceph/pull/43547 Backport Bot
03:51 PM Backport #52937 (In Progress): octopus: Primary OSD crash caused corrupted object and further cra...
Mykola Golub
02:46 PM Backport #52937 (Rejected): octopus: Primary OSD crash caused corrupted object and further crashe...
https://github.com/ceph/ceph/pull/43545 Backport Bot
03:47 PM Backport #52936 (In Progress): pacific: Primary OSD crash caused corrupted object and further cra...
Mykola Golub
02:46 PM Backport #52936 (Resolved): pacific: Primary OSD crash caused corrupted object and further crashe...
https://github.com/ceph/ceph/pull/43544 Backport Bot
02:43 PM Bug #48959 (Pending Backport): Primary OSD crash caused corrupted object and further crashes duri...
Kefu Chai
02:36 PM Bug #52815 (Resolved): exact_timespan_str()
Kefu Chai
12:49 PM Bug #42861: Libceph-common.so needs to use private link attribute when including dpdk static library
Linked with shared libs, it can find the ethernet port, and the test passed.
[root@ceph1 aarch64-openEuler-linux-gnu]# cat ./src/test...
chunsong feng
12:49 PM Bug #42861: Libceph-common.so needs to use private link attribute when including dpdk static library
Linked with the static lib, it can't find the ethernet port
[root@ceph1 aarch64-openEuler-linux-gnu]# cat ./src/test/msgr/CMakeFi...
chunsong feng
12:49 PM Bug #52930 (New): Cannot get 'quorum_status' output from socket file
As per the doc [1], if we try to get quorum_status it fails with an invalid command error.
Is this no longer available in ceph 16.x...
Karun Josy
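For reference, in recent releases quorum_status is served through the mon command interface rather than the admin socket; a hedged sketch of the alternatives:
  ceph quorum_status              # top-level mon command
  ceph tell mon.a quorum_status   # per-monitor, over the network
  ceph daemon mon.a mon_status    # admin-socket command that still exists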
10:31 AM Bug #52513: BlueStore.cc: 12391: ceph_abort_msg("unexpected error") on operation 15
@Neha, I will try to reproduce this via removing objects from replicas Konstantin Shalygin
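A hedged sketch of removing an object from one replica offline with ceph-objectstore-tool (the OSD must be stopped first; the osd id, pgid, and object name are placeholders):
  systemctl stop ceph-osd@2
  ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-2 --pgid 2.7 myobject remove
  systemctl start ceph-osd@2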
07:04 AM Bug #52925: pg peering always occurs after triggering async recovery
PR: https://github.com/ceph/ceph/pull/43534 yite gu
06:40 AM Bug #52925 (Closed): pg peering always occurs after triggering async recovery
My ceph version is 14.2.21. I want to test the pg async recovery function, so I only set the osd.9 config "osd_async_recovery_mi... yite gu
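The truncated option is presumably osd_async_recovery_min_cost (an assumption; the report cuts off mid-name); a sketch of biasing one OSD toward async recovery:
  # assumed option name; lowering it makes async recovery trigger more readily
  ceph config set osd.9 osd_async_recovery_min_cost 0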

10/13/2021

07:09 PM Bug #51076: "wait_for_recovery: failed before timeout expired" during thrashosd test with EC back...
... Deepika Upadhyay
05:58 PM Bug #52162: crash: int MonitorDBStore::apply_transaction(MonitorDBStore::TransactionRef): abort
Hey Josh,
This issue is marked as 'Duplicate', but it has no 'duplicates' relation, only a 'relates to' on...
Yaarit Hatuka
01:53 PM Feature #52609: New PG states for pending scrubs / repairs
Looks good to me! Thanks Ronen Michael Kidd
07:21 AM Bug #52901 (Fix Under Review): osd/scrub: setting then clearing noscrub may lock a PG in 'scrubbi...
Ronen Friedman

10/12/2021

10:25 PM Bug #52872 (Fix Under Review): LibRadosTwoPoolsPP.ManifestSnapRefcount Failure.
Neha Ojha
03:20 AM Bug #52872: LibRadosTwoPoolsPP.ManifestSnapRefcount Failure.
https://github.com/ceph/ceph/pull/43493 Myoungwon Oh
09:12 PM Backport #52620 (In Progress): pacific: partial recovery become whole object recovery after resta...
Neha Ojha
09:11 PM Backport #52770 (In Progress): pacific: pg scrub stat mismatch with special objects that have has...
Neha Ojha
09:10 PM Backport #52843 (In Progress): pacific: msg/async/ProtocolV2: recv_stamp of a message is set to a...
Neha Ojha
06:46 PM Bug #52578 (Fix Under Review): CLI - osd pool rm --help message is wrong or misleading
Neha Ojha
04:30 PM Bug #52901 (Resolved): osd/scrub: setting then clearing noscrub may lock a PG in 'scrubbing' state
The recent scrub scheduling code errs (in one location) by incorrectly considering noscrub as not precluding deep-scrub.
Ronen Friedman
02:42 PM Bug #51463: blocked requests while stopping/starting OSDs
Sure.
A simple cluster with 5 nodes, 125 OSDs in total,
one pool, replicated size 3, min_size 1,
at least this in t...
Manuel Lausch
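A sketch of recreating that pool shape on a test cluster (the PG count is a guess; the report truncates before those details):
  ceph osd pool create blocktest 1024 1024 replicated
  ceph osd pool set blocktest size 3
  ceph osd pool set blocktest min_size 1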
09:23 AM Feature #52609 (Fix Under Review): New PG states for pending scrubs / repairs
Please see the updated proposed change (https://github.com/ceph/ceph/pull/43403 - new comment from today).
I hope it...
Ronen Friedman
06:47 AM Bug #44184: Slow / Hanging Ops after pool creation
Wido den Hollander wrote:
> On a cluster with 1405 OSDs I've run into a situation for the second time now where a po...
Ist Gab

10/11/2021

10:28 PM Bug #52385: a possible data loss due to recovery_unfound PG after restarting all nodes
Satoru Takeuchi wrote:
> > Can you share the full set of logs using ceph-post-file (https://docs.ceph.com/en/pacific...
Neha Ojha
10:25 PM Backport #52893 (Rejected): octopus: ceph-kvstore-tool repair segfaults without bluestore-kv
Backport Bot
10:25 PM Backport #52892 (Resolved): pacific: ceph-kvstore-tool repair segfaults without bluestore-kv
https://github.com/ceph/ceph/pull/51254 Backport Bot
10:24 PM Bug #52756 (Pending Backport): ceph-kvstore-tool repair segfaults without bluestore-kv
based on https://tracker.ceph.com/issues/52756#note-2, looks like the fix needs to be backported all the way. Neha Ojha
10:20 PM Bug #52513 (Need More Info): BlueStore.cc: 12391: ceph_abort_msg("unexpected error") on operati...
Can you capture a coredump or osd logs with debug_osd=20,debug_bluestore=20,debug_ms=1 if this crash is reproducible? Neha Ojha
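Those levels can be set before attempting to reproduce, e.g.:
  ceph config set osd debug_osd 20
  ceph config set osd debug_bluestore 20
  ceph config set osd debug_ms 1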
10:13 PM Bug #52126 (Pending Backport): stretch mode: allow users to change the tiebreaker monitor
Neha Ojha
10:13 PM Bug #52872: LibRadosTwoPoolsPP.ManifestSnapRefcount Failure.
Myoungwon Oh : I am assigning it to you, in case you have any thoughts on this issue. Feel free to un-assign if you d... Neha Ojha
10:08 PM Bug #52884 (Fix Under Review): osd: optimize pg peering latency when adding a new osd that needs backfill
Neha Ojha
05:29 AM Bug #52884: osd: optimize pg peering latency when adding a new osd that needs backfill
https://github.com/ceph/ceph/pull/43482 jianwei zhang
05:28 AM Bug #52884 (Fix Under Review): osd: optimize pg peering latency when adding a new osd that needs backfill
To reproduce:
(1) the ceph cluster is not running any client IO
(2) only ceph osd in osd.14 operation (add new osd to cluste...
jianwei zhang
10:07 PM Bug #52886: osd: backfill reservation does not take compression into account
I'll create a trello card to track this, I think the initial toofull implementation was intentionally kept simple, bu... Neha Ojha
08:42 AM Bug #52886 (New): osd: backfill reservation does not take compression into account
The problem may be observed with the recently added backfill-toofull test when it runs with bluestore-comp-lz4. When ... Mykola Golub
10:02 PM Bug #51463: blocked requests while stopping/starting OSDs
Is it possible for you to share your test reproducer with us? It would be great if we could run it against a vstart c... Neha Ojha
05:19 PM Bug #52889 (Triaged): upgrade tests fails because of missing ceph-volume package
We need something like 4e525127fbb710c1ac074cf61b448055781a69e3, for octopus-x as well. Neha Ojha
04:43 PM Bug #52889 (Triaged): upgrade tests fails because of missing ceph-volume package
Recently we separated ceph-volume from ceph-base into a separate package; I have not looked deeply into it yet, but I s... Deepika Upadhyay

10/10/2021

07:03 PM Documentation #22843 (Won't Fix): [doc][luminous] the configuration guide still contains osd_op_t...
Anthony D'Atri
07:02 PM Documentation #7386 (Won't Fix): librados: document rados_osd_op_timeout and rados_mon_op_timeout...
Seven years old, marked @Advanced@, making a judgement call and closing this. If you disagree, let me know and I'll ... Anthony D'Atri

10/09/2021

10:27 PM Support #52881 (New): Filtered out host node3.foo.com: does not belong to mon public_network ()
I am running a Ceph Pacific cluster ( version 16.2.6) consisting of 3 nodes with public Internet Addresses. I also ha... Ralph Soika
07:43 PM Bug #52867: pick_address.cc prints: unable to find any IPv4 address in networks 'fd00:fd00:fd00:3...
I set the following after bootstrap and before adding any OSDs but I got the same error.
`ceph config set mon ms_b...
John Fulton
03:12 PM Bug #52878 (Fix Under Review): qa/tasks: python3 'dict' object has no attribute 'iterkeys' error
Kefu Chai
08:07 AM Bug #52878 (Resolved): qa/tasks: python3 'dict' object has no attribute 'iterkeys' error
os: CentOS8.4
ceph version: ceph16.2.4
Teuthology error log:
2021-08-27T09:18:09.787 DEBUG:teuthology.orchestra....
Zhiwei Dai
12:05 AM Bug #39150: mon: "FAILED ceph_assert(session_map.sessions.empty())" when out of quorum
... Deepika Upadhyay

10/08/2021

10:22 PM Bug #52640: when osds are out, reducing pool size reports an error "Error ERANGE: pool id # pg_num 256 si...
merged https://github.com/ceph/ceph/pull/43324 Yuri Weinstein
09:06 PM Bug #45202: Repeatedly OSD crashes in PrimaryLogPG::hit_set_trim()
All evidence points to a firmware bug persisting in dozens of used Hitachi 12TB (HGST model HUH721212AL5200) HDDs ... Dmitry Smirnov
12:18 PM Bug #52872 (Pending Backport): LibRadosTwoPoolsPP.ManifestSnapRefcount Failure.
/a/yuriw-2021-10-04_21:49:48-rados-wip-yuri4-testing-2021-10-04-1236-distro-basic-smithi/6421867... Sridhar Seshasayee

10/07/2021

11:00 PM Backport #52868 (In Progress): stretch mode: allow users to change the tiebreaker monitor
Greg Farnum
10:52 PM Backport #52868 (Resolved): stretch mode: allow users to change the tiebreaker monitor
https://github.com/ceph/ceph/pull/43457 Greg Farnum
08:59 PM Bug #52867 (New): pick_address.cc prints: unable to find any IPv4 address in networks 'fd00:fd00:...
When using IPv6 for my public and cluster network my mon is able to bootstrap (because I have [1]) but I end up with ... John Fulton
02:25 PM Backport #52809: octopus: ceph-erasure-code-tool: new tool to encode/decode files
Backport Bot wrote:
> https://github.com/ceph/ceph/pull/43407
merged
Yuri Weinstein
12:40 AM Backport #52845 (Rejected): pacific: osd: add scrub duration to pg dump
https://github.com/ceph/ceph/pull/43704 Backport Bot
12:35 AM Bug #52605 (Pending Backport): osd: add scrub duration to pg dump
Neha Ojha

10/06/2021

10:45 PM Backport #52843 (Resolved): pacific: msg/async/ProtocolV2: recv_stamp of a message is set to a wr...
https://github.com/ceph/ceph/pull/43511 Backport Bot
10:45 PM Backport #52842 (Rejected): octopus: msg/async/ProtocolV2: recv_stamp of a message is set to a wr...
Backport Bot
10:45 PM Backport #52841 (Resolved): pacific: shard threads cannot wake up
https://github.com/ceph/ceph/pull/51262 Backport Bot
10:45 PM Backport #52840 (Rejected): octopus: shard threads cannot wake up
Backport Bot
10:45 PM Backport #52839 (Resolved): pacific: rados: build minimally when "WITH_MGR" is off
https://github.com/ceph/ceph/pull/51250 Backport Bot
10:45 PM Backport #52838 (Rejected): octopus: rados: build minimally when "WITH_MGR" is off
Backport Bot
10:43 PM Bug #52796 (Pending Backport): rados: build minimally when "WITH_MGR" is off
Kefu Chai
10:42 PM Bug #52781 (Pending Backport): shard threads cannot wake up
Kefu Chai
10:40 PM Bug #52739 (Pending Backport): msg/async/ProtocolV2: recv_stamp of a message is set to a wrong value
Kefu Chai
06:24 PM Bug #48965: qa/standalone/osd/osd-force-create-pg.sh: TEST_reuse_id: return 1
https://pulpito.ceph.com/kchai-2021-10-05_16:14:10-rados-wip-kefu-testing-2021-10-05-2221-distro-basic-smithi/6423567 Greg Farnum
05:56 PM Backport #52832 (In Progress): nautilus: osd: pg may get stuck in backfill_toofull after backfill...
Mykola Golub
04:35 PM Backport #52832 (Rejected): nautilus: osd: pg may get stuck in backfill_toofull after backfill is...
https://github.com/ceph/ceph/pull/43439 Backport Bot
05:40 PM Backport #52833 (In Progress): octopus: osd: pg may get stuck in backfill_toofull after backfill ...
Mykola Golub
04:35 PM Backport #52833 (Resolved): octopus: osd: pg may get stuck in backfill_toofull after backfill is ...
https://github.com/ceph/ceph/pull/43438 Backport Bot
05:39 PM Backport #52831 (In Progress): pacific: osd: pg may get stuck in backfill_toofull after backfill ...
Mykola Golub
04:35 PM Backport #52831 (Resolved): pacific: osd: pg may get stuck in backfill_toofull after backfill is ...
https://github.com/ceph/ceph/pull/43437 Backport Bot
04:31 PM Bug #52448 (Pending Backport): osd: pg may get stuck in backfill_toofull after backfill is interr...
Mykola Golub

10/05/2021

02:56 PM Backport #51552: octopus: rebuild-mondb hangs
Kefu Chai wrote:
> https://github.com/ceph/ceph/pull/43263
merged
Yuri Weinstein
02:46 PM Backport #51569: octopus: pool last_epoch_clean floor is stuck after pg merging
Backport Bot wrote:
> https://github.com/ceph/ceph/pull/42837
merged
Yuri Weinstein
12:53 PM Bug #52815 (Fix Under Review): exact_timespan_str()
Ronen Friedman
12:25 PM Bug #52815 (Resolved): exact_timespan_str()
exact_timespan_str() in ceph_time.cc handles some specific time-spans incorrectly:
150.567 seconds, for example, w...
Ronen Friedman
04:06 AM Bug #43580 (Resolved): pg: fastinfo incorrect when last_update moves backward in time
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Loïc Dachary
04:06 AM Bug #44798 (Resolved): librados mon_command (mgr) command hang
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Loïc Dachary
04:05 AM Bug #48611 (Resolved): osd: Delay sending info to new backfill peer resetting last_backfill until...
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Loïc Dachary
04:04 AM Bug #49894 (Resolved): set a non-zero default value for osd_client_message_cap
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Loïc Dachary
04:03 AM Bug #51000 (Resolved): LibRadosTwoPoolsPP.ManifestSnapRefcount failure
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Loïc Dachary
04:02 AM Bug #51419 (Resolved): bufferlist::splice() may cause stack corruption in bufferlist::rebuild_ali...
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Loïc Dachary
04:02 AM Bug #51627 (Resolved): FAILED ceph_assert(attrs || !recovery_state.get_pg_log().get_missing().is_...
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Loïc Dachary
04:00 AM Backport #51952 (Resolved): pacific: FAILED ceph_assert(attrs || !recovery_state.get_pg_log().get...
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/43099
m...
Loïc Dachary
04:00 AM Backport #52322 (Resolved): pacific: LibRadosTwoPoolsPP.ManifestSnapRefcount failure
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/43306
m...
Loïc Dachary
03:59 AM Backport #51117: pacific: osd: Run osd bench test to override default max osd capacity for mclock.
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/41731
m...
Loïc Dachary
03:55 AM Backport #51555 (Resolved): octopus: mon: return -EINVAL when handling unknown option in 'ceph os...
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/43266
m...
Loïc Dachary
03:55 AM Backport #51967 (Resolved): octopus: set a non-zero default value for osd_client_message_cap
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/42616
m...
Loïc Dachary
03:52 AM Backport #51604 (Resolved): octopus: bufferlist::splice() may cause stack corruption in bufferlis...
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/42975
m...
Loïc Dachary
12:03 AM Bug #52126 (Fix Under Review): stretch mode: allow users to change the tiebreaker monitor
Greg Farnum

10/04/2021

09:33 PM Feature #52609 (In Progress): New PG states for pending scrubs / repairs
Neha Ojha
03:50 PM Bug #45202: Repeatedly OSD crashes in PrimaryLogPG::hit_set_trim()
Igor Fedotov wrote:
> Dmitry Smirnov wrote:
>
> Hey Dmitry,
> I'm not sure why you're saying this is the same co...
Igor Fedotov
03:47 PM Bug #45202: Repeatedly OSD crashes in PrimaryLogPG::hit_set_trim()
Dmitry Smirnov wrote:
> We face the same issue, with four simultaneous OSD down events at 4 different hosts and ina...
Igor Fedotov
08:46 AM Bug #50683 (Rejected): [RBD] master - cluster [WRN] Health check failed: mon is allowing insecure...
Ilya Dryomov
08:34 AM Backport #52808 (In Progress): nautilus: ceph-erasure-code-tool: new tool to encode/decode files
Mykola Golub
08:15 AM Backport #52808 (Rejected): nautilus: ceph-erasure-code-tool: new tool to encode/decode files
https://github.com/ceph/ceph/pull/43408 Backport Bot
08:17 AM Backport #52809 (In Progress): octopus: ceph-erasure-code-tool: new tool to encode/decode files
Mykola Golub
08:15 AM Backport #52809 (Resolved): octopus: ceph-erasure-code-tool: new tool to encode/decode files
https://github.com/ceph/ceph/pull/43407 Backport Bot
08:14 AM Bug #52807 (Resolved): ceph-erasure-code-tool: new tool to encode/decode files
This tool was already pushed into pre-pacific master [1], so we have had it since Pacific.
There is demand from...
Mykola Golub
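A hedged sketch of the tool's encode/decode round trip, assuming the usage shape from its help output (the profile given as comma-separated key=value pairs, the chunk indexes as a comma-separated list):
  # split myfile into myfile.0, myfile.1, myfile.2 under a 2+1 jerasure profile
  ceph-erasure-code-tool encode plugin=jerasure,k=2,m=1 4096 0,1,2 myfile
  # rebuild the original from the surviving chunks
  ceph-erasure-code-tool decode plugin=jerasure,k=2,m=1 4096 0,1,2 myfile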

10/03/2021

06:41 PM Bug #45202: Repeatedly OSD crashes in PrimaryLogPG::hit_set_trim()
We face the same issue, with four simultaneous OSD down events at 4 different hosts and an inability to restart them al... Dmitry Smirnov
02:29 PM Bug #50657: smart query on monitors
Just wanted to add that we have a similar situation where we have 3 dedicated mon nodes, each running in their own cont... Matthew Darwin

10/01/2021

02:07 PM Bug #51463: blocked requests while stopping/starting OSDs
This is still an issue in the newest Pacific release (16.2.5) as well.
The developer documentation mentioned above ...
Manuel Lausch
01:03 AM Bug #52796 (Resolved): rados: build minimally when "WITH_MGR" is off
Minimize the footprint of the MGR when WITH_MGR is off. Include the minimum in the MON. Don't include any MGR tests.
J. Eric Ivancich
12:51 AM Bug #52781 (Fix Under Review): shard threads cannot wake up
Kefu Chai

09/30/2021

10:53 PM Backport #51952: pacific: FAILED ceph_assert(attrs || !recovery_state.get_pg_log().get_missing()....
Backport Bot wrote:
> https://github.com/ceph/ceph/pull/43099
merged
Yuri Weinstein
10:52 PM Backport #52322: pacific: LibRadosTwoPoolsPP.ManifestSnapRefcount failure
Backport Bot wrote:
> https://github.com/ceph/ceph/pull/43306
merged
Yuri Weinstein
08:01 PM Backport #52792 (Rejected): octopus: common/TrackedOp.cc: 163: FAILED ceph_assert((sharded_in_fli...
Backport Bot
08:01 PM Backport #52791 (Resolved): pacific: common/TrackedOp.cc: 163: FAILED ceph_assert((sharded_in_fli...
https://github.com/ceph/ceph/pull/51249 Backport Bot
08:01 PM Bug #51527 (New): Ceph osd crashed due to segfault
Neha Ojha
02:54 PM Bug #51527: Ceph osd crashed due to segfault
I was able to reproduce on pacific (v16.2.6). Log attached. J. Eric Ivancich
07:59 PM Bug #44715 (Pending Backport): common/TrackedOp.cc: 163: FAILED ceph_assert((sharded_in_flight_li...
Neha Ojha
02:52 PM Bug #44715: common/TrackedOp.cc: 163: FAILED ceph_assert((sharded_in_flight_list.back())->ops_in_...
https://github.com/ceph/ceph/pull/34624 merged Yuri Weinstein
09:38 AM Bug #52761: OSDs announcing incorrect front_addr after upgrade to 16.2.6
I have kept restarting the incorrectly configured osd daemons until they got the right front_addr. In some cases it ... Javier Cacheiro
09:23 AM Bug #52761: OSDs announcing incorrect front_addr after upgrade to 16.2.6
Upgraded from v16.2.6 to v16.2.6-20210927 to apply the remoto bug fix.
After the upgrade (no reboot of the nodes b...
Javier Cacheiro
09:19 AM Support #52786 (New): processing of the finisher is a black box, any means to observe it?
Jack Lv
08:34 AM Bug #52781: shard threads cannot wake up
https://github.com/ceph/ceph/pull/43360 jianwei zhang
08:34 AM Bug #52781 (Resolved): shard threads cannot wake up
osd: fix a bug where shard threads cannot wake up.
To reproduce:
(1) the ceph cluster is not running any client IO
(2) only ceph osd...
jianwei zhang

09/29/2021

06:48 PM Feature #52609: New PG states for pending scrubs / repairs
While I agree the format is readable, it's a bit narrow in application.
Would it be a significant undertaking to:
...
Michael Kidd
07:11 AM Feature #52609: New PG states for pending scrubs / repairs
That schedule element seems like a pretty reasonable human-readable summary. Samuel Just
05:18 PM Bug #51527: Ceph osd crashed due to segfault
I've attached the shell script "load-bi.sh". It requires that a cluster be brought up with RGW. It requires that a bu... J. Eric Ivancich
04:00 PM Bug #52756: ceph-kvstore-tool repair segfaults without bluestore-kv
... huang jun
03:51 PM Bug #52756: ceph-kvstore-tool repair segfaults without bluestore-kv
huang jun wrote:
> [...]
The backtrace looks like this:...
huang jun
03:09 PM Bug #52756 (Fix Under Review): ceph-kvstore-tool repair segfaults without bluestore-kv
Kefu Chai
07:26 AM Bug #52756 (Resolved): ceph-kvstore-tool repair segfaults without bluestore-kv
... huang jun
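For context, a sketch of the two invocations involved, assuming the segfault is hit when repair runs against a plain kvstore backend instead of bluestore-kv (paths are placeholders):
  ceph-kvstore-tool rocksdb /var/lib/ceph/osd/ceph-0/db repair    # crashes per the report
  ceph-kvstore-tool bluestore-kv /var/lib/ceph/osd/ceph-0 repair  # intended usage on a bluestore OSD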
03:00 PM Backport #52771 (Rejected): nautilus: pg scrub stat mismatch with special objects that have hash ...
Backport Bot
03:00 PM Backport #52770 (Resolved): pacific: pg scrub stat mismatch with special objects that have hash '...
https://github.com/ceph/ceph/pull/43512 Backport Bot
03:00 PM Backport #52769 (Resolved): octopus: pg scrub stat mismatch with special objects that have hash '...
Backport Bot
12:25 PM Bug #52761: OSDs announcing incorrect front_addr after upgrade to 16.2.6
In some cases it requires several daemon restarts until it gets to the right configuration.
I don't know if the wr...
Javier Cacheiro
10:15 AM Bug #52761: OSDs announcing incorrect front_addr after upgrade to 16.2.6
Restarting the daemons seems to get the correct configuration but it is unclear why this did not happen when they wer... Javier Cacheiro
10:00 AM Bug #52761: OSDs announcing incorrect front_addr after upgrade to 16.2.6
Just for statistics, there are now:
- 51 cases where there is an error in the front_addr or hb_front_addr configura...
Javier Cacheiro
09:52 AM Bug #52761 (New): OSDs announcing incorrect front_addr after upgrade to 16.2.6
Ceph cluster configured with a public and cluster network:
>> ceph config dump|grep network
global advanced cl...
Javier Cacheiro
09:19 AM Bug #52760 (Need More Info): Monitor unable to rejoin the cluster
Our cluster has three monitors.
After a restart, one of our monitors failed to join the cluster with:
Sep 24 07:52...
Ruben Kerkhof
 
