Activity

From 07/17/2020 to 08/15/2020

08/15/2020

01:31 PM Backport #46964 (In Progress): octopus: Pool stats increase after PG merged (PGMap::apply_increme...
Nathan Cutler
01:31 PM Backport #46934 (In Progress): octopus: "No such file or directory" when exporting or importing a...
Nathan Cutler
01:30 PM Backport #46739 (In Progress): octopus: mon: expected_num_objects warning triggers on bluestore-o...
Nathan Cutler
01:29 PM Backport #46722 (In Progress): octopus: osd/osd-bench.sh 'tell osd.N bench' hang
Nathan Cutler
01:29 PM Backport #46709 (In Progress): octopus: Negative peer_num_objects crashes osd
Nathan Cutler
01:28 PM Backport #46595 (In Progress): octopus: crash in Objecter and CRUSH map lookup
Nathan Cutler
01:27 PM Backport #46586 (In Progress): octopus: The default value of osd_scrub_during_recovery is false s...
Nathan Cutler
12:28 PM Backport #46932 (Need More Info): nautilus: librados: add LIBRBD_SUPPORTS_GETADDRS support
nautilus backport needs df507cde8d71 but backporting that to nautilus is non-trivial Nathan Cutler
12:23 PM Feature #46842: librados: add LIBRBD_SUPPORTS_GETADDRS support
Jason Dillaman wrote:
> Backports need to also include the new method from commit df507cde8d71
Note: that commit ...
Nathan Cutler
12:23 PM Backport #46931 (In Progress): octopus: librados: add LIBRBD_SUPPORTS_GETADDRS support
Nathan Cutler

08/14/2020

04:16 PM Bug #46978 (Resolved): OSD: shutdown of a OSD Host causes slow requests
Hi,
while stopping all OSDs on a host I get slow ops for a few seconds. Sometimes this doesn't happen and mostly it w...
Manuel Lausch
09:58 AM Backport #46952 (In Progress): nautilus: nautilis client may hunt for mon very long if msg v2 is ...
Nathan Cutler
09:56 AM Backport #46951 (In Progress): octopus: nautilis client may hunt for mon very long if msg v2 is n...
Nathan Cutler
08:36 AM Feature #41564 (Resolved): Issue health status warning if num_shards_repaired exceeds some threshold
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler
08:04 AM Bug #46969 (New): Octopus OSDs deadlock with slow ops and make the whole cluster unresponsive
Hi,
I have another unpleasant bug to report.
Right after I upgraded my cluster to Octopus 15.2.4 I started to e...
Vitaliy Filippov

08/13/2020

11:34 PM Backport #46096 (Resolved): nautilus: Issue health status warning if num_shards_repaired exceeds ...
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/36379
m...
Nathan Cutler
08:53 PM Bug #41944 (Resolved): inconsistent pool count in ceph -s output
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler
08:50 PM Bug #44755 (Resolved): Create stronger affinity between drivegroup specs and osd daemons
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler
08:50 PM Backport #46965 (Resolved): nautilus: Pool stats increase after PG merged (PGMap::apply_increment...
https://github.com/ceph/ceph/pull/37476 Nathan Cutler
08:50 PM Backport #46964 (Resolved): octopus: Pool stats increase after PG merged (PGMap::apply_incrementa...
https://github.com/ceph/ceph/pull/36667 Nathan Cutler
08:48 PM Bug #46275 (Resolved): Cancellation of on-going scrubs
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler
08:47 PM Bug #46443 (Resolved): ceph_osd crash in _committed_osd_maps when failed to encode first inc map
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler
08:47 PM Backport #46952 (Resolved): nautilus: nautilis client may hunt for mon very long if msg v2 is not...
https://github.com/ceph/ceph/pull/36634 Nathan Cutler
08:47 PM Backport #46951 (Resolved): octopus: nautilis client may hunt for mon very long if msg v2 is not ...
https://github.com/ceph/ceph/pull/36633 Nathan Cutler
08:45 PM Backport #46935 (Resolved): nautilus: "No such file or directory" when exporting or importing a p...
https://github.com/ceph/ceph/pull/37475 Nathan Cutler
08:45 PM Backport #46934 (Resolved): octopus: "No such file or directory" when exporting or importing a po...
https://github.com/ceph/ceph/pull/36666 Nathan Cutler
08:44 PM Backport #46932 (Resolved): nautilus: librados: add LIBRBD_SUPPORTS_GETADDRS support
https://github.com/ceph/ceph/pull/36853 Nathan Cutler
08:44 PM Backport #46931 (Resolved): octopus: librados: add LIBRBD_SUPPORTS_GETADDRS support
https://github.com/ceph/ceph/pull/36643 Nathan Cutler
07:57 PM Backport #46707 (Resolved): octopus: Cancellation of on-going scrubs
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/36291
m...
Nathan Cutler
07:24 PM Backport #46742 (Resolved): octopus: ceph_osd crash in _committed_osd_maps when failed to encode ...
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/36340
m...
Nathan Cutler
12:28 AM Bug #46914 (Fix Under Review): mon: stuck osd_pgtemp message forwards
https://github.com/ceph/ceph/pull/36593 Greg Farnum

08/12/2020

11:42 PM Bug #43893: lingering osd_failure ops (due to failure_info holding references?)
Looking at this ticket again it's not a no_reply issue at all -- the last event is it getting marked as no_reply, but... Greg Farnum
11:24 PM Bug #46914: mon: stuck osd_pgtemp message forwards
Looking through the source, there's one clear way this happens: the leader may decide that a message can get dropped ... Greg Farnum
11:22 PM Bug #46914 (Resolved): mon: stuck osd_pgtemp message forwards
https://bugzilla.redhat.com/show_bug.cgi?id=1866257
We've seen osd_pgtemp messages which are forwarded to the lead...
Greg Farnum
06:21 PM Bug #46889 (Need More Info): librados: crashed in service_daemon_update_status
Are there any logs or coredump available? What version was this? Josh Durgin
02:24 PM Backport #46096: nautilus: Issue health status warning if num_shards_repaired exceeds some threshold
Patrick Donnelly wrote:
> https://github.com/ceph/ceph/pull/36379
merged
Yuri Weinstein

08/11/2020

12:12 PM Bug #46847: Loss of placement information on OSD reboot
I repeated the experiment and the result is very different from the other case descriptions. Apparently, part of the ... Frank Schilder
12:31 AM Bug #46732: teuthology.exceptions.MaxWhileTries: 'check for active or peered' reached maximum tri...
Unable to reproduce. Brad Hubbard
12:10 AM Bug #46889 (Need More Info): librados: crashed in service_daemon_update_status
... Xiubo Li

08/10/2020

09:53 PM Bug #46845 (Fix Under Review): Newly orchestrated OSD fails with 'unable to find any IPv4 address...
Neha Ojha
05:18 PM Backport #46707: octopus: Cancellation of on-going scrubs
David Zafman wrote:
> https://github.com/ceph/ceph/pull/36291
merged
Yuri Weinstein
01:05 PM Bug #46881 (New): failure on gcc 9.3.0 with Boost 1.71
... Deepika Upadhyay
09:31 AM Bug #44815 (Pending Backport): Pool stats increase after PG merged (PGMap::apply_incremental does...
Kefu Chai
09:30 AM Bug #44815 (Resolved): Pool stats increase after PG merged (PGMap::apply_incremental doesn't subt...
Kefu Chai
04:03 AM Bug #46405: osd/osd-rep-recov-eio.sh: TEST_rados_repair_warning: return 1
/a/yuriw-2020-08-06_00:31:28-rados-wip-yuri8-testing-octopus-distro-basic-smithi/5291111 Brad Hubbard
03:48 AM Bug #45441: rados: Health check failed: 1/3 mons down, quorum a,c (MON_DOWN)" in cluster log'
/a/yuriw-2020-08-06_00:31:28-rados-wip-yuri8-testing-octopus-distro-basic-smithi/5290942 Brad Hubbard
03:32 AM Bug #45761: mon_thrasher: "Error ENXIO: mon unavailable" during sync_force command leads to "fail...
/a/yuriw-2020-08-06_00:31:28-rados-wip-yuri8-testing-octopus-distro-basic-smithi/5291043 Brad Hubbard
03:20 AM Bug #46877 (New): mon_clock_skew_check: expected MON_CLOCK_SKEW but got none
http://sentry.ceph.com/sepia/teuthology/issues/2921/ only seen rarely so setting sev/prio accordingly
/a/yuriw-202...
Brad Hubbard
02:05 AM Bug #46876: osd/ECBackend: optimize remaining read as readop contain multiple objects
https://github.com/ceph/ceph/pull/35821 Zengran Zhang
02:04 AM Bug #46876 (Resolved): osd/ECBackend: optimize remaining read as readop contain multiple objects
in https://github.com/ceph/ceph/pull/21911
we s/rop.to_read.swap/rop.to_read.erase/ in send_all_remaining_reads(...)...
Zengran Zhang

08/08/2020

03:29 AM Bug #46845 (In Progress): Newly orchestrated OSD fails with 'unable to find any IPv4 address in n...
Cool, I've tracked down what's happening. Will push the first version of a patch up on Monday. I think if we get an I... Matthew Oliver

08/07/2020

12:42 PM Feature #46842 (Pending Backport): librados: add LIBRBD_SUPPORTS_GETADDRS support
Backports need to also include the new method from commit df507cde8d71 Jason Dillaman
07:33 AM Bug #46847: Loss of placement information on OSD reboot
Thanks a lot for this info. There have been a few more scenarios discussed on the users-list, all involving changes t... Frank Schilder
06:51 AM Bug #46845: Newly orchestrated OSD fails with 'unable to find any IPv4 address in networks '2001:...
Matthew Oliver wrote:
> I've managed to recreate the issue in a vstart env. It happens when I use ipv6 but set the `...
Daniël Vos
02:08 AM Bug #46845: Newly orchestrated OSD fails with 'unable to find any IPv4 address in networks '2001:...
I've managed to recreate the issue in a vstart env. It happens when I use ipv6 but set the `public network` to an ipv... Matthew Oliver

08/06/2020

10:35 PM Bug #46264 (Fix Under Review): mon: check for mismatched daemon versions
Neha Ojha
05:59 PM Backport #46742: octopus: ceph_osd crash in _committed_osd_maps when failed to encode first inc map
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/36340
merged
Yuri Weinstein
05:14 PM Bug #46847: Loss of placement information on OSD reboot
We have had this problem for a long time, one reason was resolved in #37439. But it still persists in some cases, and... Jonas Jelten
01:59 PM Bug #46847 (Need More Info): Loss of placement information on OSD reboot
During rebalancing after adding new disks to a cluster, the cluster loses placement information on reboot of an "old... Frank Schilder
09:24 AM Bug #46845: Newly orchestrated OSD fails with 'unable to find any IPv4 address in networks '2001:...
I think this is a duplicate of https://tracker.ceph.com/issues/39711
The workaround was to disable `ms_bind_ipv4`, a...
Matthew Oliver
08:33 AM Bug #46845 (Resolved): Newly orchestrated OSD fails with 'unable to find any IPv4 address in netw...
I just started deploying 60 OSDs to my new 15.2.4 Octopus IPv6 cephadm cluster. I applied the spec for the OSDs and t... Daniël Vos
05:27 AM Bug #46829: Periodic Lagged/Stalled PGs on new Cluster
This is on current master, right? Samuel Just
05:25 AM Bug #46829: Periodic Lagged/Stalled PGs on new Cluster
Can you paste two adjacent cycles? I'm curious about the timestamp of the subsequent cycle. Samuel Just
04:25 AM Feature #46842 (Resolved): librados: add LIBRBD_SUPPORTS_GETADDRS support
This will be very helpful when releasing a Ceph package (like an RPM) that backports the rados_getaddrs() API to previous versi... Xiubo Li

08/04/2020

06:29 PM Bug #45721: CommandFailedError: Command failed (workunit test rados/test_python.sh) FAIL: test_ra...
rados/basic/{ceph clusters/{fixed-2 openstack} msgr-failures/many msgr/async-v2only objectstore/bluestore-comp-zstd r... Neha Ojha
06:08 PM Bug #43048: nautilus: upgrade/mimic-x/stress-split: failed to recover before timeout expired
/a/teuthology-2020-08-01_16:48:51-upgrade:mimic-x-nautilus-distro-basic-smithi/5277468 Neha Ojha
05:01 PM Bug #46829 (New): Periodic Lagged/Stalled PGs on new Cluster
While Radek and I have been working on improving bufferlist append overhead I've been noticing that periodically I en... Mark Nelson
04:12 PM Bug #21592: LibRadosCWriteOps.CmpExt got 0 instead of -4095-1
/a/yuriw-2020-08-01_15:45:48-rados-nautilus-distro-basic-smithi/5276330 Neha Ojha
08:47 AM Bug #46824 (Pending Backport): "No such file or directory" when exporting or importing a pool if ...
Kefu Chai
04:48 AM Bug #46824 (Fix Under Review): "No such file or directory" when exporting or importing a pool if ...
Kefu Chai
04:45 AM Bug #46824 (Resolved): "No such file or directory" when exporting or importing a pool if locator ...
Fixes the following error when exporting a pool that contains objects
with a locator key set:...
Kefu Chai

08/03/2020

01:05 PM Bug #46816 (Resolved): mon stat prints plain text with -f json
... Jan Fajerski

08/02/2020

02:46 PM Bug #43413: Virtual IP address of iface lo results in failing to start an OSD
lei xin wrote:
> I ran into the same problem and my trigger condition was the same, i.e. when configuring the VIP o...
lei xin
02:33 PM Bug #43413: Virtual IP address of iface lo results in failing to start an OSD
I ran into the same problem and my trigger condition was the same, i.e. when configuring the VIP on the loopback int... lei xin

07/31/2020

09:56 PM Feature #39012 (In Progress): osd: distinguish unfound + impossible to find, vs start some down O...
David Zafman
12:35 PM Bug #46445 (Pending Backport): nautilis client may hunt for mon very long if msg v2 is not enable...
Kefu Chai
12:29 PM Bug #46323: thrash_cache_writeback_proxy_none: FAILED ceph_assert(version == old_value.version) i...
/a/kchai-2020-07-31_01:42:48-rados-wip-kefu-testing-2020-07-30-2107-distro-basic-smithi/5271969 Kefu Chai
11:34 AM Bug #38322 (Closed): luminous: mons do not trim maps until restarted
Joao Eduardo Luis
11:34 AM Bug #38322 (Resolved): luminous: mons do not trim maps until restarted
Joao Eduardo Luis
11:32 AM Support #8600 (Closed): MON crashes on new crushmap injection
closing because no one has complained for 6 years. Joao Eduardo Luis
10:33 AM Bug #44755: Create stronger affinity between drivegroup specs and osd daemons
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler
10:32 AM Bug #46224 (Resolved): Health check failed: 4 mgr modules have failed (MGR_MODULE_ERROR)
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler
09:34 AM Feature #38603 (Resolved): mon: osdmap prune
luminous is EOL Joao Eduardo Luis
09:32 AM Backport #38610 (Rejected): luminous: mon: osdmap prune
luminous is EOL and the backport PR has been closed Nathan Cutler
09:15 AM Fix #6496 (Closed): mon: PGMap::dump should use TextTable
Joao Eduardo Luis
09:13 AM Bug #18859 (Closed): kraken monitor fails to bootstrap off jewel monitors if it has booted before
Joao Eduardo Luis
09:12 AM Bug #18043 (Closed): ceph-mon prioritizes public_network over mon_host address
Joao Eduardo Luis
06:42 AM Backport #46741 (Resolved): nautilus: ceph_osd crash in _committed_osd_maps when failed to encode...
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/36339
m...
Nathan Cutler

07/30/2020

11:59 PM Backport #46741: nautilus: ceph_osd crash in _committed_osd_maps when failed to encode first inc map
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/36339
merged
Yuri Weinstein
08:02 PM Bug #46732: teuthology.exceptions.MaxWhileTries: 'check for active or peered' reached maximum tri...
... Neha Ojha
11:35 AM Bug #46318 (Triaged): mon_recovery: quorum_status times out
Joao Eduardo Luis
11:32 AM Bug #46428: mon: all the 3 mon daemons crashed when running the fs aio test
Are you co-locating the test and the monitors? Can this be fd depletion? Joao Eduardo Luis
05:17 AM Backport #46408 (Resolved): octopus: Health check failed: 4 mgr modules have failed (MGR_MODULE_E...
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/35995
m...
Nathan Cutler
05:15 AM Backport #46372 (Resolved): osd: expose osdspec_affinity to osd_metadata
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/35957
m...
Nathan Cutler

07/29/2020

05:35 PM Documentation #46760 (Fix Under Review): The default value of osd_op_queue is wpq since v11.0.0
Neha Ojha
03:36 PM Documentation #46760: The default value of osd_op_queue is wpq since v11.0.0
https://github.com/ceph/ceph/pull/36354 Benoît Knecht
03:32 PM Documentation #46760 (Fix Under Review): The default value of osd_op_queue is wpq since v11.0.0
Since 14adc9d33f, `osd_op_queue` defaults to `wpq`, but the documentation was still stating that its default value is... Benoît Knecht
04:53 AM Bug #46732 (Need More Info): teuthology.exceptions.MaxWhileTries: 'check for active or peered' re...
Looks like osd.2 was taken down by the thrasher and did not come back up. We'd probably need a full set of logs to wo... Brad Hubbard
04:31 AM Backport #46742 (In Progress): octopus: ceph_osd crash in _committed_osd_maps when failed to enco...
Nathan Cutler
04:30 AM Backport #46742 (Resolved): octopus: ceph_osd crash in _committed_osd_maps when failed to encode ...
https://github.com/ceph/ceph/pull/36340 Nathan Cutler
04:31 AM Bug #45991 (Resolved): PG merge: FAILED ceph_assert(info.history.same_interval_since != 0)
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler
04:30 AM Backport #46741 (In Progress): nautilus: ceph_osd crash in _committed_osd_maps when failed to enc...
Nathan Cutler
04:29 AM Backport #46741 (Resolved): nautilus: ceph_osd crash in _committed_osd_maps when failed to encode...
https://github.com/ceph/ceph/pull/36339 Nathan Cutler
04:19 AM Backport #46706 (Resolved): nautilus: Cancellation of on-going scrubs
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/36292
m...
Nathan Cutler
04:19 AM Backport #46090 (Resolved): nautilus: PG merge: FAILED ceph_assert(info.history.same_interval_sin...
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/36161
m...
Nathan Cutler
01:11 AM Bug #46443 (Pending Backport): ceph_osd crash in _committed_osd_maps when failed to encode first ...
Neha Ojha

07/28/2020

05:35 PM Backport #46706: nautilus: Cancellation of on-going scrubs
David Zafman wrote:
> https://github.com/ceph/ceph/pull/36292
merged
Yuri Weinstein
05:34 PM Backport #46090: nautilus: PG merge: FAILED ceph_assert(info.history.same_interval_since != 0)
Patrick Donnelly wrote:
> https://github.com/ceph/ceph/pull/36161
merged
Yuri Weinstein
03:32 PM Backport #46739 (Resolved): octopus: mon: expected_num_objects warning triggers on bluestore-only...
https://github.com/ceph/ceph/pull/36665 Nathan Cutler
03:32 PM Backport #46738 (Resolved): nautilus: mon: expected_num_objects warning triggers on bluestore-onl...
https://github.com/ceph/ceph/pull/37474 Nathan Cutler
02:59 PM Backport #46408: octopus: Health check failed: 4 mgr modules have failed (MGR_MODULE_ERROR)
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/35995
merged
Yuri Weinstein
02:59 PM Backport #46372: osd: expose osdspec_affinity to osd_metadata
Joshua Schmid wrote:
> https://github.com/ceph/ceph/pull/35957
merged
Yuri Weinstein
01:43 PM Bug #23031: FAILED assert(!parent->get_log().get_missing().is_missing(soid))
Hi guys, what's the status of this problem now? Have we resolved the assert in the QA tests? huang jun
04:39 AM Bug #46323: thrash_cache_writeback_proxy_none: FAILED ceph_assert(version == old_value.version) i...
/a/yuriw-2020-07-13_23:00:15-rados-wip-yuri8-testing-2020-07-13-1946-octopus-distro-basic-smithi/5224163 Brad Hubbard
04:03 AM Bug #46732 (Need More Info): teuthology.exceptions.MaxWhileTries: 'check for active or peered' re...
/a/yuriw-2020-07-13_23:00:15-rados-wip-yuri8-testing-2020-07-13-1946-octopus-distro-basic-smithi/5223971... Brad Hubbard
03:37 AM Bug #45615: api_watch_notify_pp: LibRadosWatchNotifyPPTests/LibRadosWatchNotifyPP.WatchNotify/1 f...
/a/yuriw-2020-07-13_23:00:15-rados-wip-yuri8-testing-2020-07-13-1946-octopus-distro-basic-smithi/5223919 Brad Hubbard
03:35 AM Bug #45423: api_tier_pp: [ FAILED ] LibRadosTwoPoolsPP.HitSetWrite
/ceph/teuthology-archive/yuriw-2020-07-13_23:00:15-rados-wip-yuri8-testing-2020-07-13-1946-octopus-distro-basic-smith... Brad Hubbard
03:27 AM Bug #27053: qa: thrashosds: "[ERR] : 2.0 has 1 objects unfound and apparently lost"
/a/yuriw-2020-07-13_23:00:15-rados-wip-yuri8-testing-2020-07-13-1946-octopus-distro-basic-smithi/5224148 Brad Hubbard
02:56 AM Bug #45318: Health check failed: 2/6 mons down, quorum b,a,c,e (MON_DOWN)" in cluster log running...
'msgr-failures/few', 'msgr/async-v1only', 'no_pools', 'objectstore/bluestore-comp-zlib', 'rados', 'rados/multimon/{cl... Brad Hubbard
02:50 AM Bug #45761: mon_thrasher: "Error ENXIO: mon unavailable" during sync_force command leads to "fail...
/a/yuriw-2020-07-13_23:00:15-rados-wip-yuri8-testing-2020-07-13-1946-octopus-distro-basic-smithi/5224050 Brad Hubbard
02:21 AM Bug #37532 (Pending Backport): mon: expected_num_objects warning triggers on bluestore-only setups
Kefu Chai
02:17 AM Bug #46405: osd/osd-rep-recov-eio.sh: TEST_rados_repair_warning: return 1
/a/kchai-2020-07-27_15:50:48-rados-wip-kefu-testing-2020-07-27-2127-distro-basic-smithi/5261869 Kefu Chai

07/27/2020

06:41 PM Bug #27053: qa: thrashosds: "[ERR] : 2.0 has 1 objects unfound and apparently lost"
/ceph/teuthology-archive/pdonnell-2020-07-17_01:54:54-kcephfs-wip-pdonnell-testing-20200717.003135-distro-basic-smith... Patrick Donnelly
04:18 PM Bug #46443: ceph_osd crash in _committed_osd_maps when failed to encode first inc map
Updated the affected version, as it impacts all Octopus releases. Xiaoxi Chen
04:02 PM Bug #46443 (Fix Under Review): ceph_osd crash in _committed_osd_maps when failed to encode first ...
Neha Ojha
03:31 PM Bug #46443: ceph_osd crash in _committed_osd_maps when failed to encode first inc map
Ahh now I understand why v14.2.10 crashes: fa842716b6dc3b2077e296d388c646f1605568b0 changed the `osdmap` in _committe... Dan van der Ster
11:57 AM Bug #46443: ceph_osd crash in _committed_osd_maps when failed to encode first inc map
Maybe this will fix (untested -- use on a test cluster first):... Dan van der Ster
10:27 AM Bug #46443: ceph_osd crash in _committed_osd_maps when failed to encode first inc map
The issue also persists in the latest Octopus release.
Xiaoxi Chen
08:39 AM Bug #46443: ceph_osd crash in _committed_osd_maps when failed to encode first inc map
I think it is the mon, not the peer OSD. (We just upgraded the mon from 14.2.10 to 15.2.4; the log below is with mon 15.2.4.)
...
Xiaoxi Chen
07:52 AM Bug #46443: ceph_osd crash in _committed_osd_maps when failed to encode first inc map
> For those OSDs that cannot start, it is 100% reproducible.
Could you set debug_ms = 1 on that osd, then inspect the lo...
Dan van der Ster
06:19 AM Bug #46443: ceph_osd crash in _committed_osd_maps when failed to encode first inc map
Dan van der Ster wrote:
> @Xiaoxi thanks for confirming. What are the circumstances of your crash? Did it start spon...
Xiaoxi Chen
06:18 AM Bug #46443: ceph_osd crash in _committed_osd_maps when failed to encode first inc map
@Dan
Yes/no, it is not 100% same that in our case we have several clusters that start adding OSDs with 14.2.10 into...
Xiaoxi Chen
11:35 AM Backport #46722 (Resolved): octopus: osd/osd-bench.sh 'tell osd.N bench' hang
https://github.com/ceph/ceph/pull/36664 Nathan Cutler
11:33 AM Bug #45561 (Resolved): rados/test_envlibrados_for_rocksdb.sh fails on Xenial (seen in nautilus)
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler
11:32 AM Bug #46064 (Resolved): Add statfs output to ceph-objectstore-tool
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler
11:30 AM Backport #46710 (Resolved): nautilus: Negative peer_num_objects crashes osd
https://github.com/ceph/ceph/pull/37473 Nathan Cutler
11:30 AM Backport #46709 (Resolved): octopus: Negative peer_num_objects crashes osd
https://github.com/ceph/ceph/pull/36663 Nathan Cutler

07/26/2020

08:04 PM Backport #46460 (Resolved): octopus: pybind/mgr/balancer: should use "==" and "!=" for comparing ...
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/36036
m...
Nathan Cutler
08:03 PM Backport #46116 (Resolved): nautilus: Add statfs output to ceph-objectstore-tool
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/35713
m...
Nathan Cutler
08:02 PM Backport #45677 (Resolved): nautilus: rados/test_envlibrados_for_rocksdb.sh fails on Xenial (seen...
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/35237
m...
Nathan Cutler

07/25/2020

05:55 PM Bug #46705 (Pending Backport): Negative peer_num_objects crashes osd
Kefu Chai
12:45 AM Bug #46705 (Resolved): Negative peer_num_objects crashes osd
https://pulpito.ceph.com/xxg-2020-07-20_02:56:08-rados:thrash-nautilus-lie-distro-basic-smithi/5240518/
Full stack...
xie xingguo
04:49 PM Backport #46706 (In Progress): nautilus: Cancellation of on-going scrubs
David Zafman
04:09 PM Backport #46706 (Resolved): nautilus: Cancellation of on-going scrubs
https://github.com/ceph/ceph/pull/36292 David Zafman
04:17 PM Backport #46707 (In Progress): octopus: Cancellation of on-going scrubs
David Zafman
04:10 PM Backport #46707 (Resolved): octopus: Cancellation of on-going scrubs
https://github.com/ceph/ceph/pull/36291 David Zafman
03:57 PM Bug #46275 (Pending Backport): Cancellation of on-going scrubs
David Zafman

07/24/2020

09:29 PM Bug #46405: osd/osd-rep-recov-eio.sh: TEST_rados_repair_warning: return 1
I'm not seeing this on my build machine using run-standalone.sh David Zafman
07:11 PM Backport #46116: nautilus: Add statfs output to ceph-objectstore-tool
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/35713
merged
Yuri Weinstein
07:08 PM Backport #45677: nautilus: rados/test_envlibrados_for_rocksdb.sh fails on Xenial (seen in nautilus)
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/35237
merged
Yuri Weinstein
06:21 AM Bug #46443: ceph_osd crash in _committed_osd_maps when failed to encode first inc map
@Xiaoxi thanks for confirming. What are the circumstances of your crash? Did it start spontaneously after you upgrade... Dan van der Ster
06:16 AM Bug #46443: ceph_osd crash in _committed_osd_maps when failed to encode first inc map
I do have a coredump captured; the osdmap is null, which leads to a segmentation fault in osdmap->isup. Xiaoxi Chen

07/23/2020

02:37 PM Bug #43888 (Pending Backport): osd/osd-bench.sh 'tell osd.N bench' hang
Neha Ojha
04:34 AM Bug #46428: mon: all the 3 mon daemons crashed when running the fs aio test
The steps:
1, mount one cephfs kernel client to /mnt/cephfs/
2, run the following command:...
Xiubo Li
04:32 AM Bug #46428: mon: all the 3 mon daemons crashed when running the fs aio test
I couldn't reproduce it locally. Could the core team check the above core dump to see whether they have any idea about... Xiubo Li

07/22/2020

10:00 PM Bug #43888 (Fix Under Review): osd/osd-bench.sh 'tell osd.N bench' hang
More details in https://gist.github.com/aclamk/fac791df3510840c640e18a0e6a4c724 Neha Ojha
07:55 PM Bug #46275 (Fix Under Review): Cancellation of on-going scrubs
David Zafman
04:02 PM Backport #46460: octopus: pybind/mgr/balancer: should use "==" and "!=" for comparing strings
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/36036
merged
Yuri Weinstein
03:52 PM Bug #46670 (New): refuse to remove mon from the monmap if the mon is in quorum
Before accepting to remove the mon when "ceph mon remove" is used, we must not acknowledge the request if the mon is ... Sébastien Han

07/21/2020

06:36 PM Feature #46663 (Resolved): Add pg count for pools in the `ceph df` command
Add pg count for pools in the `ceph df` command Vikhyat Umrao
06:31 PM Bug #43174: pgs inconsistent, union_shard_errors=missing
https://github.com/ceph/ceph/pull/35938 is closed in favor of https://github.com/ceph/ceph/pull/36230 Mykola Golub
02:17 AM Bug #46428 (In Progress): mon: all the 3 mon daemons crashed when running the fs aio test
Xiubo Li

07/20/2020

05:39 PM Bug #46242: rados -p default.rgw.buckets.data returning over millions objects No such file or dir...
I think you should first verify the correct name with the `rados stat` command, as Josh suggested.
For example I did tr...
Vikhyat Umrao
03:19 PM Bug #43861 (Resolved): ceph_test_rados_watch_notify hang
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler
03:16 PM Bug #46143 (Resolved): osd: make message cap option usable again
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler
01:21 AM Bug #46443: ceph_osd crash in _committed_osd_maps when failed to encode first inc map
Initially there's a crc error building the full from the first incremental in the loop:... Dan van der Ster

07/19/2020

06:20 PM Bug #43174 (In Progress): pgs inconsistent, union_shard_errors=missing
David Zafman
06:03 PM Bug #46562 (Rejected): ceph tell PGID scrub/deep_scrub stopped working
This wasn't really the problem I was seeing. David Zafman

07/17/2020

06:10 PM Bug #46596 (Resolved): ceph-osd --mkfs: *** longjmp causes uninitialized stack frame ***: /usr/bi...
Kefu Chai
03:47 PM Bug #46596 (Fix Under Review): ceph-osd --mkfs: *** longjmp causes uninitialized stack frame ***:...
https://github.com/ceph/ceph-container/pull/1712 Kefu Chai
11:53 AM Bug #46596: ceph-osd --mkfs: *** longjmp causes uninitialized stack frame ***: /usr/bin/ceph-osd ...
there is a small possibility that this is related to https://github.com/ceph/ceph/pull/33770 Sebastian Wagner
11:24 AM Bug #46596 (Resolved): ceph-osd --mkfs: *** longjmp causes uninitialized stack frame ***: /usr/bi...
This is likely a regression that was merged yesterday into master (July 16th).... Sebastian Wagner
05:56 PM Backport #46017 (Resolved): nautilus: ceph_test_rados_watch_notify hang
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/36031
m...
Nathan Cutler
05:39 PM Backport #46017: nautilus: ceph_test_rados_watch_notify hang
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/36031
merged
Yuri Weinstein
05:56 PM Backport #46164 (Resolved): nautilus: osd: make message cap option usable again
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/35738
m...
Nathan Cutler
05:38 PM Backport #46164: nautilus: osd: make message cap option usable again
Neha Ojha wrote:
> https://github.com/ceph/ceph/pull/35738
merged
Yuri Weinstein
05:27 PM Bug #46603 (New): osd/osd-backfill-space.sh: TEST_ec_backfill_simple: return 1
... Neha Ojha
04:16 PM Backport #46090 (In Progress): nautilus: PG merge: FAILED ceph_assert(info.history.same_interval_...
Nathan Cutler
02:44 PM Bug #40081: mon: luminous crash attempting to decode maps after nautilus quorum has been formed
https://github.com/ceph/ceph/pull/28671 was closed
That PR was cherry-picked to nautilus via https://github.com/ce...
Nathan Cutler
11:17 AM Backport #46595 (Resolved): octopus: crash in Objecter and CRUSH map lookup
https://github.com/ceph/ceph/pull/36662 Nathan Cutler
11:17 AM Bug #44314 (Resolved): osd-backfill-stats.sh failing intermittently in TEST_backfill_sizeup_out()...
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler
11:16 AM Bug #45606 (Resolved): build_incremental_map_msg missing incremental map while snaptrim or backfi...
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler
11:16 AM Bug #45733 (Resolved): osd-scrub-repair.sh: SyntaxError: invalid syntax
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler
11:16 AM Bug #45795 (Resolved): PrimaryLogPG.cc: 627: FAILED ceph_assert(!get_acting_recovery_backfill().e...
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler
11:16 AM Bug #45943 (Resolved): Ceph Monitor heartbeat grace period does not reset.
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler
11:15 AM Bug #46053 (Resolved): osd: wakeup all threads of shard rather than one thread
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler
11:14 AM Backport #46587 (Resolved): nautilus: The default value of osd_scrub_during_recovery is false sin...
https://github.com/ceph/ceph/pull/37472 Nathan Cutler
11:14 AM Backport #46586 (Resolved): octopus: The default value of osd_scrub_during_recovery is false sinc...
https://github.com/ceph/ceph/pull/36661 Nathan Cutler
11:13 AM Backport #45890 (Resolved): nautilus: osd: pg stuck in waitactingchange when new acting set doesn...
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/35389
m...
Nathan Cutler
11:13 AM Backport #45883 (Resolved): nautilus: osd-scrub-repair.sh: SyntaxError: invalid syntax
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/35388
m...
Nathan Cutler
11:12 AM Backport #45776 (Resolved): nautilus: build_incremental_map_msg missing incremental map while sna...
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/35386
m...
Nathan Cutler
11:09 AM Backport #46286 (Resolved): octopus: mon: log entry with garbage generated by bad memory access
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/36035
m...
Nathan Cutler
11:06 AM Backport #46261 (Resolved): octopus: larger osd_scrub_max_preemptions values cause Floating point...
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/36034
m...
Nathan Cutler
11:06 AM Backport #46089 (Resolved): octopus: PG merge: FAILED ceph_assert(info.history.same_interval_sinc...
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/36033
m...
Nathan Cutler
11:05 AM Backport #46086 (Resolved): octopus: osd: wakeup all threads of shard rather than one thread
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/36032
m...
Nathan Cutler
11:05 AM Backport #46016 (Resolved): octopus: osd-backfill-stats.sh failing intermittently in TEST_backfil...
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/36030
m...
Nathan Cutler
11:05 AM Backport #46007 (Resolved): octopus: PrimaryLogPG.cc: 627: FAILED ceph_assert(!get_acting_recover...
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/36029
m...
Nathan Cutler
 
