Activity

From 07/01/2020 to 07/30/2020

07/30/2020

11:59 PM Backport #46741: nautilus: ceph_osd crash in _committed_osd_maps when failed to encode first inc map
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/36339
merged
Yuri Weinstein
08:02 PM Bug #46732: teuthology.exceptions.MaxWhileTries: 'check for active or peered' reached maximum tri...
... Neha Ojha
11:35 AM Bug #46318 (Triaged): mon_recovery: quorum_status times out
Joao Eduardo Luis
11:32 AM Bug #46428: mon: all the 3 mon daemons crashed when running the fs aio test
Are you co-locating the test and the monitors? Can this be fd depletion? Joao Eduardo Luis
05:17 AM Backport #46408 (Resolved): octopus: Health check failed: 4 mgr modules have failed (MGR_MODULE_E...
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/35995
m...
Nathan Cutler
05:15 AM Backport #46372 (Resolved): osd: expose osdspec_affinity to osd_metadata
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/35957
m...
Nathan Cutler

07/29/2020

05:35 PM Documentation #46760 (Fix Under Review): The default value of osd_op_queue is wpq since v11.0.0
Neha Ojha
03:36 PM Documentation #46760: The default value of osd_op_queue is wpq since v11.0.0
https://github.com/ceph/ceph/pull/36354 Benoît Knecht
03:32 PM Documentation #46760 (Fix Under Review): The default value of osd_op_queue is wpq since v11.0.0
Since 14adc9d33f, `osd_op_queue` defaults to `wpq`, but the documentation was still stating that its default value is... Benoît Knecht
04:53 AM Bug #46732 (Need More Info): teuthology.exceptions.MaxWhileTries: 'check for active or peered' re...
Looks like osd.2 was taken down by the thrasher and did not come back up. We'd probably need a full set of logs to wo... Brad Hubbard
04:31 AM Backport #46742 (In Progress): octopus: ceph_osd crash in _committed_osd_maps when failed to enco...
Nathan Cutler
04:30 AM Backport #46742 (Resolved): octopus: ceph_osd crash in _committed_osd_maps when failed to encode ...
https://github.com/ceph/ceph/pull/36340 Nathan Cutler
04:31 AM Bug #45991 (Resolved): PG merge: FAILED ceph_assert(info.history.same_interval_since != 0)
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler
04:30 AM Backport #46741 (In Progress): nautilus: ceph_osd crash in _committed_osd_maps when failed to enc...
Nathan Cutler
04:29 AM Backport #46741 (Resolved): nautilus: ceph_osd crash in _committed_osd_maps when failed to encode...
https://github.com/ceph/ceph/pull/36339 Nathan Cutler
04:19 AM Backport #46706 (Resolved): nautilus: Cancellation of on-going scrubs
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/36292
m...
Nathan Cutler
04:19 AM Backport #46090 (Resolved): nautilus: PG merge: FAILED ceph_assert(info.history.same_interval_sin...
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/36161
m...
Nathan Cutler
01:11 AM Bug #46443 (Pending Backport): ceph_osd crash in _committed_osd_maps when failed to encode first ...
Neha Ojha

07/28/2020

05:35 PM Backport #46706: nautilus: Cancellation of on-going scrubs
David Zafman wrote:
> https://github.com/ceph/ceph/pull/36292
merged
Yuri Weinstein
05:34 PM Backport #46090: nautilus: PG merge: FAILED ceph_assert(info.history.same_interval_since != 0)
Patrick Donnelly wrote:
> https://github.com/ceph/ceph/pull/36161
merged
Yuri Weinstein
03:32 PM Backport #46739 (Resolved): octopus: mon: expected_num_objects warning triggers on bluestore-only...
https://github.com/ceph/ceph/pull/36665 Nathan Cutler
03:32 PM Backport #46738 (Resolved): nautilus: mon: expected_num_objects warning triggers on bluestore-onl...
https://github.com/ceph/ceph/pull/37474 Nathan Cutler
02:59 PM Backport #46408: octopus: Health check failed: 4 mgr modules have failed (MGR_MODULE_ERROR)
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/35995
merged
Yuri Weinstein
02:59 PM Backport #46372: osd: expose osdspec_affinity to osd_metadata
Joshua Schmid wrote:
> https://github.com/ceph/ceph/pull/35957
merged
Yuri Weinstein
01:43 PM Bug #23031: FAILED assert(!parent->get_log().get_missing().is_missing(soid))
Hi guys, what's the status of this problem now? Have we resolved the assert in the QA tests? huang jun
04:39 AM Bug #46323: thrash_cache_writeback_proxy_none: FAILED ceph_assert(version == old_value.version) i...
/a/yuriw-2020-07-13_23:00:15-rados-wip-yuri8-testing-2020-07-13-1946-octopus-distro-basic-smithi/5224163 Brad Hubbard
04:03 AM Bug #46732 (Need More Info): teuthology.exceptions.MaxWhileTries: 'check for active or peered' re...
/a/yuriw-2020-07-13_23:00:15-rados-wip-yuri8-testing-2020-07-13-1946-octopus-distro-basic-smithi/5223971... Brad Hubbard
03:37 AM Bug #45615: api_watch_notify_pp: LibRadosWatchNotifyPPTests/LibRadosWatchNotifyPP.WatchNotify/1 f...
/a/yuriw-2020-07-13_23:00:15-rados-wip-yuri8-testing-2020-07-13-1946-octopus-distro-basic-smithi/5223919 Brad Hubbard
03:35 AM Bug #45423: api_tier_pp: [ FAILED ] LibRadosTwoPoolsPP.HitSetWrite
/ceph/teuthology-archive/yuriw-2020-07-13_23:00:15-rados-wip-yuri8-testing-2020-07-13-1946-octopus-distro-basic-smith... Brad Hubbard
03:27 AM Bug #27053: qa: thrashosds: "[ERR] : 2.0 has 1 objects unfound and apparently lost"
/a/yuriw-2020-07-13_23:00:15-rados-wip-yuri8-testing-2020-07-13-1946-octopus-distro-basic-smithi/5224148 Brad Hubbard
02:56 AM Bug #45318: Health check failed: 2/6 mons down, quorum b,a,c,e (MON_DOWN)" in cluster log running...
'msgr-failures/few', 'msgr/async-v1only', 'no_pools', 'objectstore/bluestore-comp-zlib', 'rados', 'rados/multimon/{cl... Brad Hubbard
02:50 AM Bug #45761: mon_thrasher: "Error ENXIO: mon unavailable" during sync_force command leads to "fail...
/a/yuriw-2020-07-13_23:00:15-rados-wip-yuri8-testing-2020-07-13-1946-octopus-distro-basic-smithi/5224050 Brad Hubbard
02:21 AM Bug #37532 (Pending Backport): mon: expected_num_objects warning triggers on bluestore-only setups
Kefu Chai
02:17 AM Bug #46405: osd/osd-rep-recov-eio.sh: TEST_rados_repair_warning: return 1
/a/kchai-2020-07-27_15:50:48-rados-wip-kefu-testing-2020-07-27-2127-distro-basic-smithi/5261869 Kefu Chai

07/27/2020

06:41 PM Bug #27053: qa: thrashosds: "[ERR] : 2.0 has 1 objects unfound and apparently lost"
/ceph/teuthology-archive/pdonnell-2020-07-17_01:54:54-kcephfs-wip-pdonnell-testing-20200717.003135-distro-basic-smith... Patrick Donnelly
04:18 PM Bug #46443: ceph_osd crash in _committed_osd_maps when failed to encode first inc map
Updated the affected version, as it impacts all Octopus releases. Xiaoxi Chen
04:02 PM Bug #46443 (Fix Under Review): ceph_osd crash in _committed_osd_maps when failed to encode first ...
Neha Ojha
03:31 PM Bug #46443: ceph_osd crash in _committed_osd_maps when failed to encode first inc map
Ahh now I understand why v14.2.10 crashes: fa842716b6dc3b2077e296d388c646f1605568b0 changed the `osdmap` in _committe... Dan van der Ster
11:57 AM Bug #46443: ceph_osd crash in _committed_osd_maps when failed to encode first inc map
Maybe this will fix (untested -- use on a test cluster first):... Dan van der Ster
10:27 AM Bug #46443: ceph_osd crash in _committed_osd_maps when failed to encode first inc map
The issue also persists in the latest Octopus release.
Xiaoxi Chen
08:39 AM Bug #46443: ceph_osd crash in _committed_osd_maps when failed to encode first inc map
I think it is the mon, not the peer OSD. (We just upgraded the mon from 14.2.10 to 15.2.4; the log below is with mon 15.2.4.)
...
Xiaoxi Chen
07:52 AM Bug #46443: ceph_osd crash in _committed_osd_maps when failed to encode first inc map
> For those OSDs that cannot start, it is 100% reproducible.
Could you set debug_ms = 1 on that osd, then inspect the lo...
Dan van der Ster
06:19 AM Bug #46443: ceph_osd crash in _committed_osd_maps when failed to encode first inc map
Dan van der Ster wrote:
> @Xiaoxi thanks for confirming. What are the circumstances of your crash? Did it start spon...
Xiaoxi Chen
06:18 AM Bug #46443: ceph_osd crash in _committed_osd_maps when failed to encode first inc map
@Dan
Yes and no. It is not 100% the same, in that in our case we have several clusters that started adding OSDs with 14.2.10 into...
Xiaoxi Chen
11:35 AM Backport #46722 (Resolved): octopus: osd/osd-bench.sh 'tell osd.N bench' hang
https://github.com/ceph/ceph/pull/36664 Nathan Cutler
11:33 AM Bug #45561 (Resolved): rados/test_envlibrados_for_rocksdb.sh fails on Xenial (seen in nautilus)
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler
11:32 AM Bug #46064 (Resolved): Add statfs output to ceph-objectstore-tool
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler
11:30 AM Backport #46710 (Resolved): nautilus: Negative peer_num_objects crashes osd
https://github.com/ceph/ceph/pull/37473 Nathan Cutler
11:30 AM Backport #46709 (Resolved): octopus: Negative peer_num_objects crashes osd
https://github.com/ceph/ceph/pull/36663 Nathan Cutler

07/26/2020

08:04 PM Backport #46460 (Resolved): octopus: pybind/mgr/balancer: should use "==" and "!=" for comparing ...
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/36036
m...
Nathan Cutler
08:03 PM Backport #46116 (Resolved): nautilus: Add statfs output to ceph-objectstore-tool
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/35713
m...
Nathan Cutler
08:02 PM Backport #45677 (Resolved): nautilus: rados/test_envlibrados_for_rocksdb.sh fails on Xenial (seen...
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/35237
m...
Nathan Cutler

07/25/2020

05:55 PM Bug #46705 (Pending Backport): Negative peer_num_objects crashes osd
Kefu Chai
12:45 AM Bug #46705 (Resolved): Negative peer_num_objects crashes osd
https://pulpito.ceph.com/xxg-2020-07-20_02:56:08-rados:thrash-nautilus-lie-distro-basic-smithi/5240518/
Full stack...
xie xingguo
04:49 PM Backport #46706 (In Progress): nautilus: Cancellation of on-going scrubs
David Zafman
04:09 PM Backport #46706 (Resolved): nautilus: Cancellation of on-going scrubs
https://github.com/ceph/ceph/pull/36292 David Zafman
04:17 PM Backport #46707 (In Progress): octopus: Cancellation of on-going scrubs
David Zafman
04:10 PM Backport #46707 (Resolved): octopus: Cancellation of on-going scrubs
https://github.com/ceph/ceph/pull/36291 David Zafman
03:57 PM Bug #46275 (Pending Backport): Cancellation of on-going scrubs
David Zafman

07/24/2020

09:29 PM Bug #46405: osd/osd-rep-recov-eio.sh: TEST_rados_repair_warning: return 1
I'm not seeing this on my build machine using run-standalone.sh David Zafman
07:11 PM Backport #46116: nautilus: Add statfs output to ceph-objectstore-tool
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/35713
merged
Yuri Weinstein
07:08 PM Backport #45677: nautilus: rados/test_envlibrados_for_rocksdb.sh fails on Xenial (seen in nautilus)
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/35237
merged
Yuri Weinstein
06:21 AM Bug #46443: ceph_osd crash in _committed_osd_maps when failed to encode first inc map
@Xiaoxi thanks for confirming. What are the circumstances of your crash? Did it start spontaneously after you upgrade... Dan van der Ster
06:16 AM Bug #46443: ceph_osd crash in _committed_osd_maps when failed to encode first inc map
I do have a coredump captured; the osdmap is null, which leads to a segmentation fault in osdmap->isup. Xiaoxi Chen

07/23/2020

02:37 PM Bug #43888 (Pending Backport): osd/osd-bench.sh 'tell osd.N bench' hang
Neha Ojha
04:34 AM Bug #46428: mon: all the 3 mon daemons crashed when running the fs aio test
The steps:
1. Mount one cephfs kernel client to /mnt/cephfs/
2. Run the following command:...
Xiubo Li
04:32 AM Bug #46428: mon: all the 3 mon daemons crashed when running the fs aio test
I couldn't reproduce it locally. Could the core team check the above core dump to see whether they have any idea about... Xiubo Li

07/22/2020

10:00 PM Bug #43888 (Fix Under Review): osd/osd-bench.sh 'tell osd.N bench' hang
More details in https://gist.github.com/aclamk/fac791df3510840c640e18a0e6a4c724 Neha Ojha
07:55 PM Bug #46275 (Fix Under Review): Cancellation of on-going scrubs
David Zafman
04:02 PM Backport #46460: octopus: pybind/mgr/balancer: should use "==" and "!=" for comparing strings
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/36036
merged
Yuri Weinstein
03:52 PM Bug #46670 (New): refuse to remove mon from the monmap if the mon is in quorum
When "ceph mon remove" is used, we must not acknowledge the removal request if the mon is ... Sébastien Han
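The check requested in this issue can be sketched roughly as below. This is only an illustration of the idea, not the monitor's actual code: the helper name `may_remove_mon` is hypothetical, and it assumes the JSON printed by `ceph quorum_status` carries a `quorum_names` list.

```python
import json


def may_remove_mon(mon_name: str, quorum_status_json: str) -> bool:
    """Return False when mon_name is currently listed in the quorum.

    Hypothetical helper: assumes the JSON has a "quorum_names" list.
    """
    status = json.loads(quorum_status_json)
    return mon_name not in status.get("quorum_names", [])


# Hand-written sample input, not captured from a real cluster.
sample = '{"quorum_names": ["a", "b", "c"]}'
print(may_remove_mon("a", sample))  # "a" is in quorum, so removal is refused
print(may_remove_mon("d", sample))  # "d" is not in quorum
```

A wrapper script could run such a check before calling "ceph mon remove", although the issue asks for the refusal to happen on the monitor side.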

07/21/2020

06:36 PM Feature #46663 (Resolved): Add pg count for pools in the `ceph df` command
Add pg count for pools in the `ceph df` command Vikhyat Umrao
06:31 PM Bug #43174: pgs inconsistent, union_shard_errors=missing
https://github.com/ceph/ceph/pull/35938 is closed in favor of https://github.com/ceph/ceph/pull/36230 Mykola Golub
02:17 AM Bug #46428 (In Progress): mon: all the 3 mon daemons crashed when running the fs aio test
Xiubo Li

07/20/2020

05:39 PM Bug #46242: rados -p default.rgw.buckets.data returning over millions objects No such file or dir...
I think first you should verify the correct name as Josh suggested with `rados stat` command.
For example I did tr...
Vikhyat Umrao
03:19 PM Bug #43861 (Resolved): ceph_test_rados_watch_notify hang
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler
03:16 PM Bug #46143 (Resolved): osd: make message cap option usable again
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler
01:21 AM Bug #46443: ceph_osd crash in _committed_osd_maps when failed to encode first inc map
Initially there's a crc error building the full from the first incremental in the loop:... Dan van der Ster

07/19/2020

06:20 PM Bug #43174 (In Progress): pgs inconsistent, union_shard_errors=missing
David Zafman
06:03 PM Bug #46562 (Rejected): ceph tell PGID scrub/deep_scrub stopped working
This wasn't really the problem I was seeing. David Zafman

07/17/2020

06:10 PM Bug #46596 (Resolved): ceph-osd --mkfs: *** longjmp causes uninitialized stack frame ***: /usr/bi...
Kefu Chai
03:47 PM Bug #46596 (Fix Under Review): ceph-osd --mkfs: *** longjmp causes uninitialized stack frame ***:...
https://github.com/ceph/ceph-container/pull/1712 Kefu Chai
11:53 AM Bug #46596: ceph-osd --mkfs: *** longjmp causes uninitialized stack frame ***: /usr/bin/ceph-osd ...
there is a small possibility that this is related to https://github.com/ceph/ceph/pull/33770 Sebastian Wagner
11:24 AM Bug #46596 (Resolved): ceph-osd --mkfs: *** longjmp causes uninitialized stack frame ***: /usr/bi...
This is likely a regression that was merged yesterday into master (July 16th).... Sebastian Wagner
05:56 PM Backport #46017 (Resolved): nautilus: ceph_test_rados_watch_notify hang
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/36031
m...
Nathan Cutler
05:39 PM Backport #46017: nautilus: ceph_test_rados_watch_notify hang
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/36031
merged
Yuri Weinstein
05:56 PM Backport #46164 (Resolved): nautilus: osd: make message cap option usable again
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/35738
m...
Nathan Cutler
05:38 PM Backport #46164: nautilus: osd: make message cap option usable again
Neha Ojha wrote:
> https://github.com/ceph/ceph/pull/35738
merged
Yuri Weinstein
05:27 PM Bug #46603 (New): osd/osd-backfill-space.sh: TEST_ec_backfill_simple: return 1
... Neha Ojha
04:16 PM Backport #46090 (In Progress): nautilus: PG merge: FAILED ceph_assert(info.history.same_interval_...
Nathan Cutler
02:44 PM Bug #40081: mon: luminous crash attempting to decode maps after nautilus quorum has been formed
https://github.com/ceph/ceph/pull/28671 was closed
That PR was cherry-picked to nautilus via https://github.com/ce...
Nathan Cutler
11:17 AM Backport #46595 (Resolved): octopus: crash in Objecter and CRUSH map lookup
https://github.com/ceph/ceph/pull/36662 Nathan Cutler
11:17 AM Bug #44314 (Resolved): osd-backfill-stats.sh failing intermittently in TEST_backfill_sizeup_out()...
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler
11:16 AM Bug #45606 (Resolved): build_incremental_map_msg missing incremental map while snaptrim or backfi...
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler
11:16 AM Bug #45733 (Resolved): osd-scrub-repair.sh: SyntaxError: invalid syntax
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler
11:16 AM Bug #45795 (Resolved): PrimaryLogPG.cc: 627: FAILED ceph_assert(!get_acting_recovery_backfill().e...
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler
11:16 AM Bug #45943 (Resolved): Ceph Monitor heartbeat grace period does not reset.
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler
11:15 AM Bug #46053 (Resolved): osd: wakeup all threads of shard rather than one thread
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler
11:14 AM Backport #46587 (Resolved): nautilus: The default value of osd_scrub_during_recovery is false sin...
https://github.com/ceph/ceph/pull/37472 Nathan Cutler
11:14 AM Backport #46586 (Resolved): octopus: The default value of osd_scrub_during_recovery is false sinc...
https://github.com/ceph/ceph/pull/36661 Nathan Cutler
11:13 AM Backport #45890 (Resolved): nautilus: osd: pg stuck in waitactingchange when new acting set doesn...
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/35389
m...
Nathan Cutler
11:13 AM Backport #45883 (Resolved): nautilus: osd-scrub-repair.sh: SyntaxError: invalid syntax
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/35388
m...
Nathan Cutler
11:12 AM Backport #45776 (Resolved): nautilus: build_incremental_map_msg missing incremental map while sna...
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/35386
m...
Nathan Cutler
11:09 AM Backport #46286 (Resolved): octopus: mon: log entry with garbage generated by bad memory access
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/36035
m...
Nathan Cutler
11:06 AM Backport #46261 (Resolved): octopus: larger osd_scrub_max_preemptions values cause Floating point...
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/36034
m...
Nathan Cutler
11:06 AM Backport #46089 (Resolved): octopus: PG merge: FAILED ceph_assert(info.history.same_interval_sinc...
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/36033
m...
Nathan Cutler
11:05 AM Backport #46086 (Resolved): octopus: osd: wakeup all threads of shard rather than one thread
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/36032
m...
Nathan Cutler
11:05 AM Backport #46016 (Resolved): octopus: osd-backfill-stats.sh failing intermittently in TEST_backfil...
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/36030
m...
Nathan Cutler
11:05 AM Backport #46007 (Resolved): octopus: PrimaryLogPG.cc: 627: FAILED ceph_assert(!get_acting_recover...
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/36029
m...
Nathan Cutler

07/16/2020

05:09 PM Backport #45890: nautilus: osd: pg stuck in waitactingchange when new acting set doesn't change
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/35389
merged
Yuri Weinstein
05:08 PM Backport #45883: nautilus: osd-scrub-repair.sh: SyntaxError: invalid syntax
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/35388
merged
Yuri Weinstein
05:08 PM Backport #45776: nautilus: build_incremental_map_msg missing incremental map while snaptrim or ba...
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/35386
merged
Yuri Weinstein
04:30 PM Backport #46286: octopus: mon: log entry with garbage generated by bad memory access
Patrick Donnelly wrote:
> https://github.com/ceph/ceph/pull/36035
merged
Yuri Weinstein
01:14 PM Bug #46443: ceph_osd crash in _committed_osd_maps when failed to encode first inc map
Markus do you have a coredump available for further debugging? Dan van der Ster
12:52 PM Bug #43365: Nautilus: Random mon crashes in failed assertion at ceph::time_detail::signedspan
So, we experienced this bug as well, so I investigated it myself, though I'm no timekeeping, ceph, or c++ expert, jus... Anonymous
11:36 AM Documentation #46554 (Resolved): Malformed sentence in RADOS page
Zac Dover
09:54 AM Bug #42668 (Won't Fix): ceph daemon osd.* fails in osd container but ceph daemon mds.* does not f...
Ben, just run `unset CEPH_ARGS` once in the OSD container, then you will be able to use the socket commands. Sébastien Han
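For scripted use, the same workaround can be approximated by launching the command with an environment that lacks `CEPH_ARGS`. A minimal sketch, assuming the commands are driven from Python; the helper name `env_without_ceph_args` and the sample values are hypothetical.

```python
import os


def env_without_ceph_args(env=None):
    """Return a copy of the environment with CEPH_ARGS removed."""
    base = dict(os.environ if env is None else env)
    base.pop("CEPH_ARGS", None)  # no error if the variable is absent
    return base


# Hand-written sample environment for illustration.
clean = env_without_ceph_args({"CEPH_ARGS": "--cluster ceph", "PATH": "/usr/bin"})
print(clean)
```

The result could then be passed as `env=clean` to `subprocess.run(["ceph", "daemon", "osd.0", "help"], env=clean)`, where `osd.0` is a placeholder daemon name.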
09:23 AM Bug #44311 (Pending Backport): crash in Objecter and CRUSH map lookup
Kefu Chai
02:38 AM Bug #46562 (Rejected): ceph tell PGID scrub/deep_scrub stopped working

At this point I don't know if it was broken by the https://github.com/ceph/ceph/pull/30217 change.
David Zafman

07/15/2020

09:30 PM Bug #46405: osd/osd-rep-recov-eio.sh: TEST_rados_repair_warning: return 1
/a/yuriw-2020-07-13_23:06:23-rados-wip-yuri5-testing-2020-07-13-1944-octopus-distro-basic-smithi/5224649 Neha Ojha
04:36 PM Documentation #46554: Malformed sentence in RADOS page
Item 22 here:
https://pad.ceph.com/p/Report_Documentation_Bugs
Zac Dover
04:32 PM Documentation #46554 (Resolved): Malformed sentence in RADOS page
https://docs.ceph.com/docs/master/rados/
The current sentence:
Once you have a deployed a Ceph Storage Clust...
Zac Dover
03:58 PM Documentation #45988 (Resolved): [doc/os]: Centos 8 is not listed even though it is supported
Zac Dover
11:09 AM Documentation #46545 (New): Two Developer Guide pages might be redundant
https://docs.ceph.com/docs/master/dev/internals/
and
https://docs.ceph.com/docs/master/dev/developer_guide/
...
Zac Dover
10:58 AM Documentation #46531 (Pending Backport): The default value of osd_scrub_during_recovery is false ...
Kefu Chai
10:24 AM Bug #46323: thrash_cache_writeback_proxy_none: FAILED ceph_assert(version == old_value.version) i...
/a//kchai-2020-07-15_09:19:03-rados-wip-kefu-testing-2020-07-13-2108-distro-basic-smithi/5228761 Kefu Chai

07/14/2020

10:50 PM Bug #44311 (Fix Under Review): crash in Objecter and CRUSH map lookup
Jason Dillaman
04:43 PM Bug #44311: crash in Objecter and CRUSH map lookup
I am hitting this all the time now that librbd is using the 'neorados' API [1]. I plan to just rebuild the rmaps when... Jason Dillaman
04:41 PM Bug #44311 (In Progress): crash in Objecter and CRUSH map lookup
Jason Dillaman
07:12 PM Backport #46228 (Resolved): nautilus: Ceph Monitor heartbeat grace period does not reset.
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/35798
m...
Nathan Cutler
04:22 PM Backport #46228: nautilus: Ceph Monitor heartbeat grace period does not reset.
Sridhar Seshasayee wrote:
> https://github.com/ceph/ceph/pull/35798
merged
Yuri Weinstein
07:02 PM Documentation #46531 (Fix Under Review): The default value of osd_scrub_during_recovery is false ...
Nathan Cutler
12:05 PM Documentation #46531: The default value of osd_scrub_during_recovery is false since v11.1.1
I can't figure out how to add the pull request ID above, so here's a link to it instead: https://github.com/ceph/ceph... Benoît Knecht
11:58 AM Documentation #46531 (Resolved): The default value of osd_scrub_during_recovery is false since v1...
Since 8dca17c, `osd_scrub_during_recovery` defaults to `false`, but the documentation was still stating that its defa... Benoît Knecht
06:58 PM Bug #37532 (Fix Under Review): mon: expected_num_objects warning triggers on bluestore-only setups
Nathan Cutler
08:58 AM Bug #37532: mon: expected_num_objects warning triggers on bluestore-only setups
Joao Eduardo Luis wrote:
> I don't think it's wise to simply remove the code because filestore is no longer the defa...
yunqing wang
04:19 PM Backport #46261: octopus: larger osd_scrub_max_preemptions values cause Floating point exception
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/36034
merged
Yuri Weinstein
04:18 PM Backport #46089: octopus: PG merge: FAILED ceph_assert(info.history.same_interval_since != 0)
Patrick Donnelly wrote:
> https://github.com/ceph/ceph/pull/36033
merged
Yuri Weinstein
04:18 PM Backport #46086: octopus: osd: wakeup all threads of shard rather than one thread
Patrick Donnelly wrote:
> https://github.com/ceph/ceph/pull/36032
merged
Yuri Weinstein
04:17 PM Backport #46016: octopus: osd-backfill-stats.sh failing intermittently in TEST_backfill_sizeup_ou...
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/36030
merged
Yuri Weinstein
04:16 PM Backport #46007: octopus: PrimaryLogPG.cc: 627: FAILED ceph_assert(!get_acting_recovery_backfill(...
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/36029
merged
Yuri Weinstein
04:01 PM Bug #43888: osd/osd-bench.sh 'tell osd.N bench' hang
/a/yuriw-2020-07-13_19:30:53-rados-wip-yuri6-testing-2020-07-13-1520-octopus-distro-basic-smithi/5223525 Neha Ojha
01:20 PM Bug #17170: mon/monclient: update "unable to obtain rotating service keys when osd init" to sugge...
Issue fixed after setting the correct NTP server on the machines.
Followed the instructions here: https://access.redhat....
Yuval Lifshitz
09:36 AM Bug #17170: mon/monclient: update "unable to obtain rotating service keys when osd init" to sugge...
Issue still seen in the Pacific dev version:... Yuval Lifshitz

07/13/2020

07:32 PM Bug #46508: Health check failed: Reduced data availability: 1 pg inactive (PG_AVAILABILITY)
Neha Ojha wrote:
> Does not look related to https://tracker.ceph.com/issues/45619 or caused by https://github.com/ce...
Neha Ojha
07:30 PM Bug #46508: Health check failed: Reduced data availability: 1 pg inactive (PG_AVAILABILITY)
Does not look related to https://tracker.ceph.com/issues/45619 or caused by https://github.com/ceph/ceph/commit/d4fba... Neha Ojha
07:02 PM Bug #46508 (New): Health check failed: Reduced data availability: 1 pg inactive (PG_AVAILABILITY)
rados/basic/{ceph clusters/{fixed-2 openstack} msgr-failures/many msgr/async objectstore/bluestore-comp-snappy rados ... Neha Ojha
05:48 PM Bug #46506 (New): RuntimeError: Exiting scrub checking -- not all pgs scrubbed.
... Neha Ojha
01:20 PM Documentation #16356 (Resolved): doc: manual deployment of ceph monitor needs fix
https://github.com/ceph/ceph/pull/31452 resolves this issue. Zac Dover
11:19 AM Bug #46445 (Fix Under Review): nautilis client may hunt for mon very long if msg v2 is not enable...
Mykola Golub
05:04 AM Bug #46242: rados -p default.rgw.buckets.data returning over millions objects No such file or dir...
Hi Josh,
The object still exists:
I just ran your test and it failed with single quotes -> https://gyazo.com/a04fcf5b522...
Manuel Rios

07/10/2020

09:39 PM Bug #46443: ceph_osd crash in _committed_osd_maps when failed to encode first inc map
fa842716b6dc3b2077e296d388c646f1605568b0 arrived in v14.2.10 and touches _committed_osd_maps Dan van der Ster
07:42 AM Bug #46443 (Resolved): ceph_osd crash in _committed_osd_maps when failed to encode first inc map
We upgraded a mimic cluster to v14.2.10; everything was running and OK.
I triggered a monmap change with the command...
Markus Binz
09:32 PM Backport #46460 (In Progress): octopus: pybind/mgr/balancer: should use "==" and "!=" for compari...
Nathan Cutler
05:50 PM Backport #46460 (Resolved): octopus: pybind/mgr/balancer: should use "==" and "!=" for comparing ...
https://github.com/ceph/ceph/pull/36036 Nathan Cutler
09:31 PM Backport #46286 (In Progress): octopus: mon: log entry with garbage generated by bad memory access
Nathan Cutler
09:30 PM Backport #46261 (In Progress): octopus: larger osd_scrub_max_preemptions values cause Floating po...
Nathan Cutler
09:29 PM Backport #46089 (In Progress): octopus: PG merge: FAILED ceph_assert(info.history.same_interval_s...
Nathan Cutler
09:28 PM Backport #46086 (In Progress): octopus: osd: wakeup all threads of shard rather than one thread
Nathan Cutler
09:27 PM Backport #46017 (In Progress): nautilus: ceph_test_rados_watch_notify hang
Nathan Cutler
09:26 PM Backport #46018 (Resolved): octopus: ceph_test_rados_watch_notify hang
The original fix went into octopus during the pre-release phase when bugfixes were being merged to octopus and octopu... Nathan Cutler
09:24 PM Backport #46016 (In Progress): octopus: osd-backfill-stats.sh failing intermittently in TEST_back...
Nathan Cutler
09:23 PM Backport #46007 (In Progress): octopus: PrimaryLogPG.cc: 627: FAILED ceph_assert(!get_acting_reco...
Nathan Cutler
08:46 PM Bug #43888: osd/osd-bench.sh 'tell osd.N bench' hang
Couple of updates on this:
1. Reproduced the issue with some extra debug logging.
https://pulpito.ceph.com/no...
Neha Ojha
05:50 PM Backport #46461 (Resolved): nautilus: pybind/mgr/balancer: should use "==" and "!=" for comparing...
https://github.com/ceph/ceph/pull/37471 Nathan Cutler
08:20 AM Bug #46445 (Resolved): nautilis client may hunt for mon very long if msg v2 is not enabled on mons
The problem is observed for a nautilus client. For newer client versions the situation is accidentally much better (s... Mykola Golub
06:10 AM Backport #46229 (Resolved): octopus: Ceph Monitor heartbeat grace period does not reset.
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/35799
m...
Nathan Cutler
06:09 AM Backport #46165 (Resolved): octopus: osd: make message cap option usable again
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/35737
m...
Nathan Cutler

07/09/2020

07:08 PM Bug #46437 (Closed): Admin Socket leaves behind .asok files after daemons (ex: RGW) shut down gra...
Reproducer(s):
0. be in build dir
1. run vstart.sh
2. edit stop.sh to not `rm -rf "${asok_dir}"`
3. do ls of /tmp...
Ali Maredia
02:16 PM Backport #46408 (In Progress): octopus: Health check failed: 4 mgr modules have failed (MGR_MODUL...
Kefu Chai
02:34 AM Bug #46428 (In Progress): mon: all the 3 mon daemons crashed when running the fs aio test
The logs:... Xiubo Li

07/08/2020

09:19 PM Bug #46125: ceph mon memory increasing
You can attempt to use a lower target; it's not something we've tested much for the monitors. We expect the monitor t... Josh Durgin
09:17 PM Bug #46242: rados -p default.rgw.buckets.data returning over millions objects No such file or dir...
Can you verify the object name is passed to 'rados rm' correctly by enclosing it in single quotes?
Is it possible ...
Josh Durgin
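The quoting concern can be illustrated with a small sketch. Passing the object name as a single argv element avoids shell interpretation entirely, and `shlex.quote` shows what a safely single-quoted shell form would look like. The helper names and the object name `my object$1` are hypothetical; only the pool name comes from the issue title.

```python
import shlex


def rados_rm_argv(pool, obj):
    """Build an argv list; the object name stays one argument, unparsed."""
    return ["rados", "-p", pool, "rm", obj]


def rados_rm_shell(pool, obj):
    """Render the same command as a shell string with safe quoting."""
    return " ".join(shlex.quote(arg) for arg in rados_rm_argv(pool, obj))


# Hypothetical object name containing a space and a '$'.
print(rados_rm_shell("default.rgw.buckets.data", "my object$1"))
```

With the argv form, `subprocess.run(rados_rm_argv(pool, obj))` would hand the name to `rados` unchanged, which helps rule out shell quoting as the cause of a "No such file or directory" result.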
07:34 PM Backport #46229: octopus: Ceph Monitor heartbeat grace period does not reset.
Sridhar Seshasayee wrote:
> https://github.com/ceph/ceph/pull/35799
merged
Yuri Weinstein
07:32 PM Backport #46165: octopus: osd: make message cap option usable again
Neha Ojha wrote:
> https://github.com/ceph/ceph/pull/35737
merged
Yuri Weinstein
06:49 PM Bug #46323: thrash_cache_writeback_proxy_none: FAILED ceph_assert(version == old_value.version) i...
rados/singleton/{all/thrash_cache_writeback_proxy_none msgr-failures/few msgr/async-v1only objectstore/bluestore-comp... Neha Ojha
06:44 PM Bug #46405: osd/osd-rep-recov-eio.sh: TEST_rados_repair_warning: return 1
Since the original feature is being backported to nautilus and octopus.
/a/yuriw-2020-07-06_17:23:10-rados-wip-yur...
Neha Ojha
03:27 PM Bug #46405: osd/osd-rep-recov-eio.sh: TEST_rados_repair_warning: return 1
https://pulpito.ceph.com/nojha-2020-07-08_01:02:55-rados:standalone-master-distro-basic-smithi/ Neha Ojha
01:05 AM Bug #46405 (Resolved): osd/osd-rep-recov-eio.sh: TEST_rados_repair_warning: return 1
... Neha Ojha
06:37 PM Bug #45647: "ceph --cluster ceph --log-early osd last-stat-seq osd.0" times out due to msgr-failu...
/a/yuriw-2020-07-06_19:37:47-rados-wip-yuri7-testing-2020-07-06-1754-octopus-distro-basic-smithi/5204335 Neha Ojha
06:21 PM Bug #45761: mon_thrasher: "Error ENXIO: mon unavailable" during sync_force command leads to "fail...
/a/yuriw-2020-07-06_19:37:47-rados-wip-yuri7-testing-2020-07-06-1754-octopus-distro-basic-smithi/5204398 Neha Ojha
04:06 PM Bug #45139: osd/osd-markdown.sh: markdown_N_impl failure
/a/nojha-2020-07-08_01:02:55-rados:standalone-master-distro-basic-smithi/5207257 Neha Ojha
01:57 PM Documentation #46421 (In Progress): Add LoadBalancer Guide
I'm adding the LoadBalancer Guide, and I'm going to put a link to it on the Install Page.
In a future version of d...
Zac Dover
10:17 AM Backport #46408 (New): octopus: Health check failed: 4 mgr modules have failed (MGR_MODULE_ERROR)
Laura Paduano
07:46 AM Backport #46408 (In Progress): octopus: Health check failed: 4 mgr modules have failed (MGR_MODUL...
Laura Paduano
05:29 AM Backport #46408 (Resolved): octopus: Health check failed: 4 mgr modules have failed (MGR_MODULE_E...
https://github.com/ceph/ceph/pull/35995 Nathan Cutler
01:12 AM Bug #46224 (Pending Backport): Health check failed: 4 mgr modules have failed (MGR_MODULE_ERROR)
/a/yuriw-2020-07-06_19:37:47-rados-wip-yuri7-testing-2020-07-06-1754-octopus-distro-basic-smithi/5204440/ Neha Ojha

07/07/2020

11:17 AM Backport #46372 (In Progress): osd: expose osdspec_affinity to osd_metadata
Nathan Cutler
11:14 AM Backport #46372 (New): osd: expose osdspec_affinity to osd_metadata
Follow-up of https://github.com/ceph/ceph/pull/34835
Fixes: https://tracker.ceph.com/issues/44755
Nathan Cutler
11:15 AM Bug #44755: Create stronger affinity between drivegroup specs and osd daemons
Moving to RADOS project so it can be backported in the usual way. Nathan Cutler
06:56 AM Bug #46381 (New): pg down error and osd failed by :Objecter::_op_submit_with_budget
ENV:
LSB Version: :core-4.1-amd64:core-4.1-noarch
Distributor ID: CentOS
Description: CentOS Linux release 7...
Amine Liu

07/06/2020

10:29 PM Feature #46379: Add a force-scrub commands to bump already running scrubs
Maybe repair is always “force-repair”. It could be confusing to the user to do that. David Zafman
10:19 PM Feature #46379 (New): Add a force-scrub commands to bump already running scrubs

As it stands, a user-requested scrub gets first priority to start. However, if existing scrubs are already running,...
David Zafman
10:11 PM Feature #41363: Allow user to cancel scrub requests

Possible implementations:...
David Zafman
12:15 PM Backport #46372 (Duplicate): osd: expose osdspec_affinity to osd_metadata
Joshua Schmid
12:12 PM Backport #46372 (Resolved): osd: expose osdspec_affinity to osd_metadata
https://github.com/ceph/ceph/pull/35957 Joshua Schmid
10:24 AM Bug #43174: pgs inconsistent, union_shard_errors=missing
Our partners noticed that actually there is an issue with how the bluestore escapes the key strings. Here is their pa... Mykola Golub
05:43 AM Bug #43174: pgs inconsistent, union_shard_errors=missing
One of our customers also experienced this issue after adding bluestore osds to a filestore backed cluster.
Using ...
Mykola Golub

07/05/2020

12:05 PM Documentation #46361 (New): Update list of leads in the Developer Guide
https://docs.ceph.com/docs/master/dev/developer_guide/essentials/
Make sure that this list is up-to-date as of mid...
Zac Dover
12:02 PM Bug #46359 (Resolved): Install page has typo: s/suites/suits/
Zac Dover
08:26 AM Bug #46359 (Fix Under Review): Install page has typo: s/suites/suits/
Zac Dover
08:21 AM Bug #46359 (Resolved): Install page has typo: s/suites/suits/
https://docs.ceph.com/docs/master/install/
This page contains the following sentence:
Choose the method tha...
Zac Dover
02:44 AM Bug #46358 (Rejected): FAIL: test_full_health (tasks.mgr.dashboard.test_health.HealthTest)
https://github.com/ceph/ceph/pull/33827 has not been merged yet Kefu Chai
02:42 AM Bug #46358 (Rejected): FAIL: test_full_health (tasks.mgr.dashboard.test_health.HealthTest)
... Kefu Chai

07/03/2020

12:22 PM Backport #46115 (Resolved): octopus: Add statfs output to ceph-objectstore-tool
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/35715
m...
Nathan Cutler
12:30 AM Bug #43888 (In Progress): osd/osd-bench.sh 'tell osd.N bench' hang
/a/dzafman-2020-06-08_11:45:40-rados-wip-zafman-testing-distro-basic-smithi/5130086
This is the command we care ab...
Neha Ojha

07/02/2020

04:46 PM Bug #46285 (Rejected): osd: error from smartctl is always reported as invalid JSON
turns out the report was from an earlier version (it did not contain the 'output' key) Josh Durgin
04:37 PM Bug #46179 (Duplicate): Health check failed: Reduced data availability: PG_AVAILABILITY
Neha Ojha
04:36 PM Bug #46225 (Duplicate): Health check failed: 1 osds down (OSD_DOWN)
Neha Ojha
01:35 PM Bug #46264: mon: check for mismatched daemon versions
I have completed a function called check_daemon_version located in src/mon/Monitor.cc This function goes through mon_... Tyler Sheehan
09:48 AM Bug #44755 (Pending Backport): Create stronger affinity between drivegroup specs and osd daemons
Sebastian Wagner
09:04 AM Bug #46178 (Duplicate): slow request osd_op(... (undecoded) ondisk+retry+read+ignore_overlay+know...
Ilya Dryomov
08:56 AM Bug #46180 (Resolved): qa: Scrubbing terminated -- not all pgs were active and clean.
Will be cherry-picked into https://github.com/ceph/ceph/pull/35720 and https://github.com/ceph/ceph/pull/35733. Ilya Dryomov

07/01/2020

10:55 PM Bug #46325 (Rejected): A pool at size 3 should have a min_size 2

The get_osd_pool_default_min_size() calculation of size - size/2 for the min_size should special case size 3 and ju...
David Zafman
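
For reference, the `size - size/2` expression quoted above can be evaluated for small pool sizes. This is a sketch that assumes integer (truncating) division, as in C++; `default_min_size` is a hypothetical stand-in and not the actual Ceph function.

```python
def default_min_size(size: int) -> int:
    """Evaluate size - size/2 with integer division (assumed semantics)."""
    return size - size // 2


if __name__ == "__main__":
    # Print the resulting min_size for a few pool sizes.
    for size in range(1, 7):
        print(f"size={size} min_size={default_min_size(size)}")
```

Under that assumption the expression evaluates to 2 for a pool of size 3.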
10:03 PM Bug #37509 (Can't reproduce): require past_interval bounds mismatch due to osd oldest_map
Neha Ojha
09:58 PM Bug #23879 (Can't reproduce): test_mon_osdmap_prune.sh fails
Neha Ojha
09:57 PM Bug #23857 (Can't reproduce): flush (manifest) vs async recovery causes out of order op
Neha Ojha
09:56 PM Bug #23828 (Can't reproduce): ec gen object leaks into different filestore collection just after ...
Neha Ojha
09:53 PM Bug #23117: PGs stuck in "activating" after osd_max_pg_per_osd_hard_ratio has been exceeded once
We should try to make it more obvious when this limit is hit. I thought we added something in the cluster logs about ... Neha Ojha
09:49 PM Documentation #46324 (New): Sepia VPN Client Access documentation is out-of-date
https://wiki.sepia.ceph.com/doku.php?id=vpnaccess#vpn_client_access
There are two issues that I noticed that must ...
Zac Dover
09:49 PM Bug #20960 (Can't reproduce): ceph_test_rados: mismatched version (due to pg import/export)
The thrash_cache_writeback_proxy_none failure has a different root cause, opened a new tracker for it https://tracker... Neha Ojha
09:47 PM Bug #46323 (Resolved): thrash_cache_writeback_proxy_none: FAILED ceph_assert(version == old_value...
... Neha Ojha
09:35 PM Bug #19700 (Closed): OSD remained up despite cluster network being inactive?
Please reopen this bug if the issue is seen in nautilus or newer releases. Neha Ojha
09:22 PM Bug #43882 (Can't reproduce): osd to mon connection lost, osd stuck down
Neha Ojha
09:16 PM Bug #44631 (Can't reproduce): ceph pg dump error code 124
Neha Ojha
07:58 PM Bug #46275: Cancellation of on-going scrubs
We may be able to easily terminate scrubbing in between chunks if the noscrub/nodeep-scrub flags get set.
I will test this.
David Zafman
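
The idea in the comment above, stopping between chunks once a flag is set, can be sketched generically. This is not the OSD scrub code; `scrub_in_chunks`, `scrub_chunk`, and `stop_requested` are hypothetical names used only to illustrate the control flow.

```python
from typing import Callable, Iterable


def scrub_in_chunks(
    chunks: Iterable[str],
    scrub_chunk: Callable[[str], None],
    stop_requested: Callable[[], bool],
) -> int:
    """Process chunks one at a time, checking a stop flag between chunks.

    Returns the number of chunks that were fully processed. A chunk that
    has started is always finished; the flag is only consulted at chunk
    boundaries.
    """
    completed = 0
    for chunk in chunks:
        if stop_requested():  # e.g. a noscrub-style flag was set
            break
        scrub_chunk(chunk)
        completed += 1
    return completed


if __name__ == "__main__":
    flags = {"noscrub": False}
    seen = []

    def work(chunk: str) -> None:
        seen.append(chunk)
        if chunk == "chunk-2":
            flags["noscrub"] = True  # simulate the flag being set mid-scrub

    done = scrub_in_chunks(
        ["chunk-1", "chunk-2", "chunk-3"], work, lambda: flags["noscrub"]
    )
    print(done, seen)
```

Checking only at chunk boundaries keeps each chunk atomic, at the cost of a delay of up to one chunk before the scrub stops.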
07:56 PM Bug #46275 (In Progress): Cancellation of on-going scrubs
David Zafman
07:32 PM Backport #46095 (Resolved): octopus: Issue health status warning if num_shards_repaired exceeds s...
Josh Durgin
07:22 PM Backport #46115: octopus: Add statfs output to ceph-objectstore-tool
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/35715
merged
Yuri Weinstein
06:00 PM Bug #46318 (Need More Info): mon_recovery: quorum_status times out
... Neha Ojha
05:05 PM Bug #46285: osd: error from smartctl is always reported as invalid JSON
Which version is this cluster running?
I would expect to see this "output" key in the command's output:
https://g...
Yaarit Hatuka
02:43 AM Bug #46285 (Rejected): osd: error from smartctl is always reported as invalid JSON
When smartctl returns an error, the osd always reports it as invalid json. We meant to give a better error, but the c... Josh Durgin
02:51 AM Backport #46287 (Rejected): nautilus: mon: log entry with garbage generated by bad memory access
Patrick Donnelly
02:51 AM Backport #46286 (Resolved): octopus: mon: log entry with garbage generated by bad memory access
https://github.com/ceph/ceph/pull/36035 Patrick Donnelly
 
