Activity

From 05/28/2020 to 06/26/2020

06/26/2020

07:27 PM Bug #46178: slow request osd_op(... (undecoded) ondisk+retry+read+ignore_overlay+known_if_redirec...
http://pulpito.ceph.com/mgfritch-2020-06-26_02:07:27-rados-wip-mgfritch-testing-2020-06-25-1855-distro-basic-smithi/... Michael Fritch
06:21 PM Bug #46180: qa: Scrubbing terminated -- not all pgs were active and clean.
Here's a reliable reproducer for the issue:
-s rados/singleton-nomsgr -c master --filter 'all/health-warnings rado...
Neha Ojha
06:50 AM Bug #46180: qa: Scrubbing terminated -- not all pgs were active and clean.
I think it has to do with reconnect handling and how connections are reused.
This part of ProtocolV2 is pretty fra...
Ilya Dryomov
05:04 AM Bug #46180: qa: Scrubbing terminated -- not all pgs were active and clean.
This is a msgr2.1 issue.
Ilya Dryomov
05:48 PM Bug #46225 (Triaged): Health check failed: 1 osds down (OSD_DOWN)
Neha Ojha
05:39 PM Bug #46225: Health check failed: 1 osds down (OSD_DOWN)
Also, related to https://tracker.ceph.com/issues/46180... Neha Ojha
10:57 AM Bug #46225 (Duplicate): Health check failed: 1 osds down (OSD_DOWN)
/a/sseshasa-2020-06-24_17:46:09-rados-wip-sseshasa-testing-2020-06-24-1858-distro-basic-smithi/5176410
2020-06-2...
Sridhar Seshasayee
05:34 PM Bug #46227 (Duplicate): Segmentation fault when running ceph_test_keyvaluedb command as part of a...
Duplicate of https://tracker.ceph.com/issues/46054 Neha Ojha
11:19 AM Bug #46227 (Duplicate): Segmentation fault when running ceph_test_keyvaluedb command as part of a...
/a/sseshasa-2020-06-24_17:46:09-rados-wip-sseshasa-testing-2020-06-24-1858-distro-basic-smithi/5176446
Unfortunate...
Sridhar Seshasayee
05:31 PM Bug #46179 (Triaged): Health check failed: Reduced data availability: PG_AVAILABILITY
Neha Ojha
05:11 PM Bug #46179: Health check failed: Reduced data availability: PG_AVAILABILITY
This failure is different from the one seen in the RGW suite earlier due to upmap. This is related to https://tracker... Neha Ojha
07:32 AM Bug #46179: Health check failed: Reduced data availability: PG_AVAILABILITY
/a/sseshasa-2020-06-24_17:46:09-rados-wip-sseshasa-testing-2020-06-24-1858-distro-basic-smithi/
job ID: 5176200
F...
Sridhar Seshasayee
05:31 PM Bug #46224 (Fix Under Review): Health check failed: 4 mgr modules have failed (MGR_MODULE_ERROR)
Neha Ojha
10:44 AM Bug #46224 (Resolved): Health check failed: 4 mgr modules have failed (MGR_MODULE_ERROR)
/a/sseshasa-2020-06-24_17:46:09-rados-wip-sseshasa-testing-2020-06-24-1858-distro-basic-smithi/5176341 and
/a/ssesha...
Sridhar Seshasayee
05:30 PM Bug #46222 (In Progress): Cbt installation task for cosbench fails.
Neha Ojha
09:03 AM Bug #46222: Cbt installation task for cosbench fails.
See /a/sseshasa-2020-06-24_17:46:09-rados-wip-sseshasa-testing-2020-06-24-1858-distro-basic-smithi/5176322 as well Sridhar Seshasayee
09:00 AM Bug #46222 (Won't Fix): Cbt installation task for cosbench fails.
/a/sseshasa-2020-06-24_17:46:09-rados-wip-sseshasa-testing-2020-06-24-1858-distro-basic-smithi/5176309
2020-06-2...
Sridhar Seshasayee
04:48 PM Feature #46238 (New): raise a HEALTH warn, if OSDs use the cluster_network for the front
Related to: https://tracker.ceph.com/issues/46230 Michal Nasiadka
12:17 PM Backport #46229 (In Progress): octopus: Ceph Monitor heartbeat grace period does not reset.
Sridhar Seshasayee
12:14 PM Backport #46229 (New): octopus: Ceph Monitor heartbeat grace period does not reset.
Sridhar Seshasayee
11:48 AM Backport #46229 (Resolved): octopus: Ceph Monitor heartbeat grace period does not reset.
https://github.com/ceph/ceph/pull/35799 Sridhar Seshasayee
12:13 PM Backport #46228 (In Progress): nautilus: Ceph Monitor heartbeat grace period does not reset.
Sridhar Seshasayee
12:13 PM Backport #46228 (New): nautilus: Ceph Monitor heartbeat grace period does not reset.
Sridhar Seshasayee
11:47 AM Backport #46228 (Resolved): nautilus: Ceph Monitor heartbeat grace period does not reset.
https://github.com/ceph/ceph/pull/35798 Sridhar Seshasayee
11:43 AM Bug #45943 (Pending Backport): Ceph Monitor heartbeat grace period does not reset.
Sridhar Seshasayee
11:14 AM Documentation #46203 (Resolved): docs.ceph.com is down
docs.ceph.com returned four hours later. Zac Dover
08:40 AM Bug #24057: cbt fails to copy results to the archive dir
Observed the issue during this run:
/a/sseshasa-2020-06-24_17:46:09-rados-wip-sseshasa-testing-2020-06-24-1858-distr...
Sridhar Seshasayee
07:28 AM Bug #44595: cache tiering: Error: oid 48 copy_from 493 returned error code -2
/a/sseshasa-2020-06-24_17:46:09-rados-wip-sseshasa-testing-2020-06-24-1858-distro-basic-smithi/
job ID: 5176184
...
Sridhar Seshasayee
07:18 AM Bug #45441: rados: Health check failed: 1/3 mons down, quorum a,c (MON_DOWN)" in cluster log'
Observing the issue during this run:
/a/sseshasa-2020-06-24_17:46:09-rados-wip-sseshasa-testing-2020-06-24-1858-dist...
Sridhar Seshasayee
04:38 AM Bug #46125: ceph mon memory increasing
I will try with default settings for the monitor. With current config file parameters, the monitor is using 1GB.
I...
Ashish Nagar

06/25/2020

11:56 PM Bug #46216 (Resolved): mon: log entry with garbage generated by bad memory access
Causes the mgr to segmentation fault:... Patrick Donnelly
10:27 PM Bug #46178: slow request osd_op(... (undecoded) ondisk+retry+read+ignore_overlay+known_if_redirec...
/a/yuvalif-2020-06-23_14:40:15-rgw-wip-yuval-test-35331-35155-distro-basic-smithi/5173465
Seems very likely to hav...
Neha Ojha
09:09 PM Bug #46125 (Need More Info): ceph mon memory increasing
Can you try with the default settings for the monitor? What level of memory usage are you seeing exactly?
There is...
Josh Durgin
07:40 PM Bug #46180: qa: Scrubbing terminated -- not all pgs were active and clean.
The common thing in all of these is that the tests are all failing while running the ceph task, no thrashing or anyth... Neha Ojha
03:15 PM Bug #46180: qa: Scrubbing terminated -- not all pgs were active and clean.
Saw the same error during this run:
http://pulpito.ceph.com/sseshasa-2020-06-24_17:46:09-rados-wip-sseshasa-testing-...
Sridhar Seshasayee
05:29 PM Bug #46211 (Duplicate): qa: pools stuck in creating
Patrick Donnelly
05:26 PM Bug #46211 (Duplicate): qa: pools stuck in creating
During cluster setup for the CephFS suites, we see this failure:... Patrick Donnelly
03:44 PM Bug #39039: mon connection reset, command not resent
Hitting this issue on octopus, Fedora 32:... Sunny Kumar
02:18 PM Documentation #46203 (In Progress): docs.ceph.com is down
I'm afraid this is outside my control. We're at the mercy of our cloud provider. Pretty sure it's this: http://trav... David Galloway
07:49 AM Documentation #46203 (Resolved): docs.ceph.com is down
docs.ceph.com has been down since 17:35 AEST on 25 Jun 2020 at the latest.
https://downforeveryoneorjustme.com/docs.ce...
Zac Dover

06/24/2020

02:16 PM Bug #46180 (Resolved): qa: Scrubbing terminated -- not all pgs were active and clean.
Seeing several test failures in the rgw suite:... Casey Bodley
02:09 PM Bug #46179 (Duplicate): Health check failed: Reduced data availability: PG_AVAILABILITY
Multiple RGW tests are failing on different branches, with:... Casey Bodley
01:20 PM Bug #46178: slow request osd_op(... (undecoded) ondisk+retry+read+ignore_overlay+known_if_redirec...
http://pulpito.ceph.com/swagner-2020-06-24_11:30:44-rados:cephadm-wip-swagner3-testing-2020-06-24-1025-distro-basic-s... Sebastian Wagner
01:19 PM Bug #46178: slow request osd_op(... (undecoded) ondisk+retry+read+ignore_overlay+known_if_redirec...
http://pulpito.ceph.com/swagner-2020-06-24_11:30:44-rados:cephadm-wip-swagner3-testing-2020-06-24-1025-distro-basic-s... Sebastian Wagner
01:16 PM Bug #46178: slow request osd_op(... (undecoded) ondisk+retry+read+ignore_overlay+known_if_redirec...
http://pulpito.ceph.com/swagner-2020-06-24_11:30:44-rados:cephadm-wip-swagner3-testing-2020-06-24-1025-distro-basic-s... Sebastian Wagner
12:57 PM Bug #46178 (Duplicate): slow request osd_op(... (undecoded) ondisk+retry+read+ignore_overlay+know...
Saw this error yesterday for the first time:
http://pulpito.ceph.com/swagner-2020-06-23_13:15:09-rados:cephadm-wip...
Sebastian Wagner
10:37 AM Backport #45676 (Resolved): octopus: rados/test_envlibrados_for_rocksdb.sh fails on Xenial (seen ...
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/35236
m...
Nathan Cutler
02:47 AM Bug #45721: CommandFailedError: Command failed (workunit test rados/test_python.sh) FAIL: test_ra...
... Kefu Chai
01:36 AM Backport #46164 (In Progress): nautilus: osd: make message cap option usable again
Neha Ojha
01:13 AM Backport #46164 (Resolved): nautilus: osd: make message cap option usable again
https://github.com/ceph/ceph/pull/35738 Neha Ojha
01:28 AM Backport #46165 (In Progress): octopus: osd: make message cap option usable again
Neha Ojha
01:13 AM Backport #46165 (Resolved): octopus: osd: make message cap option usable again
https://github.com/ceph/ceph/pull/35737 Neha Ojha
12:18 AM Bug #46143 (Pending Backport): osd: make message cap option usable again
Neha Ojha

06/23/2020

08:11 PM Backport #45676: octopus: rados/test_envlibrados_for_rocksdb.sh fails on Xenial (seen in nautilus)
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/35236
merged
Yuri Weinstein
12:15 AM Bug #45944: osd/osd-markdown.sh: TEST_osd_stop failed
... Neha Ojha

06/22/2020

09:59 PM Backport #46115 (In Progress): octopus: Add statfs output to ceph-objectstore-tool
David Zafman
09:37 PM Backport #46116 (In Progress): nautilus: Add statfs output to ceph-objectstore-tool
David Zafman
06:52 PM Bug #45944: osd/osd-markdown.sh: TEST_osd_stop failed
/a/teuthology-2020-06-19_07:01:02-rados-master-distro-basic-smithi/5164221 Neha Ojha
05:53 PM Bug #46143 (Fix Under Review): osd: make message cap option usable again
Neha Ojha
05:36 PM Bug #46143 (In Progress): osd: make message cap option usable again
Neha Ojha
05:18 PM Bug #46143 (Resolved): osd: make message cap option usable again
"This reverts commit 45d5ac3.
Without a msg throttler, we can't change osd_client_message_cap cap
online. The thr...
Neha Ojha
04:57 PM Bug #41154: osd: pg unknown state
I have this problem again.... Alexander Kazansky
03:19 PM Documentation #46141 (New): Document automatic OSD deployment behavior better
Make certain that the documentation notifies readers that OSDs are automatically created, so that they are not caught... Zac Dover
09:12 AM Bug #46137: Monitor leader is marking multiple osd's down
Every few minutes multiple OSDs go down and come back up, which is causing data recovery. This is occurring ... Prayank Saxena
09:07 AM Bug #46137 (New): Monitor leader is marking multiple osd's down
My Ceph cluster consists of 5 mons and 58 DNs with 1302 total OSDs (HDDs), running 12.2.8 Luminous (stable) and Fi... Prayank Saxena
06:02 AM Bug #45943: Ceph Monitor heartbeat grace period does not reset.
Updates from testing the fix:
OSD failure before being marked down:...
Sridhar Seshasayee

06/21/2020

02:17 PM Feature #24099: osd: Improve workflow when creating OSD on raw block device if there was bluestor...
John Spray wrote:
> This seems like an odd idea -- if someone is doing OSD creation by hand, why would they want to ...
Niklas Hambuechen
12:25 PM Documentation #46099: document statfs operation for ceph-objectstore-tool

if (op == "statfs") {
  store_statfs_t statsbuf;
  ret = fs->statfs(&statsbuf);
  if (ret < 0) {
    ...
Zac Dover
12:10 PM Documentation #46126 (New): RGW docs lack an explanation of how permissions management works, esp...
<dirtwash> you know its sshitty protocol and design if obvious things arent visible and default behavior doesnt work
...
Zac Dover
08:02 AM Bug #46125: ceph mon memory increasing
Hi,
I have deployed ceph single node cluster.
ceph version 14.2.9 (581f22da52345dba46ee232b73b990f06029a2a0) na...
Ashish Nagar
07:13 AM Bug #46125 (Need More Info): ceph mon memory increasing
Hi,
I have deployed ceph single node cluster.
ceph version 14.2.9 (581f22da52345dba46ee232b73b990f06029a2a0) ...
Ashish Nagar

06/20/2020

10:12 PM Backport #46096 (In Progress): nautilus: Issue health status warning if num_shards_repaired excee...
Nathan Cutler
10:09 PM Backport #46095 (In Progress): octopus: Issue health status warning if num_shards_repaired exceed...
Nathan Cutler
09:57 PM Bug #45793 (Resolved): Objecter: don't attempt to read from non-primary on EC pools
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler
09:56 PM Backport #45882 (Resolved): octopus: Objecter: don't attempt to read from non-primary on EC pools
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/35444
m...
Nathan Cutler
09:56 PM Backport #45775 (Resolved): octopus: build_incremental_map_msg missing incremental map while snap...
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/35442
m...
Nathan Cutler
07:59 AM Documentation #46120 (Resolved): Improve ceph-objectstore-tool documentation
https://github.com/ceph/ceph/pull/33823
There are a number of comments by David Zafman that I failed to include in...
Zac Dover
04:20 AM Bug #46065 (Resolved): sudo missing from command in monitor-bootstrapping procedure
Zac Dover

06/19/2020

04:36 PM Backport #46116 (Resolved): nautilus: Add statfs output to ceph-objectstore-tool
https://github.com/ceph/ceph/pull/35713 Nathan Cutler
04:36 PM Backport #46115 (Resolved): octopus: Add statfs output to ceph-objectstore-tool
https://github.com/ceph/ceph/pull/35715 Nathan Cutler
05:00 AM Documentation #46099 (New): document statfs operation for ceph-objectstore-tool
https://github.com/ceph/ceph/pull/35632
https://github.com/ceph/ceph/pull/33823
The affected file (I think) is ...
Zac Dover

06/18/2020

11:26 PM Bug #46064 (Pending Backport): Add statfs output to ceph-objectstore-tool
David Zafman
01:13 AM Bug #46064 (Fix Under Review): Add statfs output to ceph-objectstore-tool
David Zafman
01:08 AM Bug #46064 (In Progress): Add statfs output to ceph-objectstore-tool
David Zafman
01:07 AM Bug #46064 (Resolved): Add statfs output to ceph-objectstore-tool

This will help diagnose out of space crashes:...
David Zafman
10:32 PM Backport #45882: octopus: Objecter: don't attempt to read from non-primary on EC pools
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/35444
merged
Yuri Weinstein
10:31 PM Backport #45775: octopus: build_incremental_map_msg missing incremental map while snaptrim or bac...
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/35442
merged
Yuri Weinstein
08:08 PM Backport #46096 (Resolved): nautilus: Issue health status warning if num_shards_repaired exceeds ...
https://github.com/ceph/ceph/pull/36379 Patrick Donnelly
08:08 PM Backport #46095 (Resolved): octopus: Issue health status warning if num_shards_repaired exceeds s...
https://github.com/ceph/ceph/pull/35685 Patrick Donnelly
08:06 PM Backport #46090 (Resolved): nautilus: PG merge: FAILED ceph_assert(info.history.same_interval_sin...
https://github.com/ceph/ceph/pull/36161 Patrick Donnelly
08:06 PM Backport #46089 (Resolved): octopus: PG merge: FAILED ceph_assert(info.history.same_interval_sinc...
https://github.com/ceph/ceph/pull/36033 Patrick Donnelly
08:06 PM Backport #46086 (Resolved): octopus: osd: wakeup all threads of shard rather than one thread
https://github.com/ceph/ceph/pull/36032 Patrick Donnelly
10:40 AM Bug #46071 (New): potential rocksdb failure: few osd's service not starting up after node reboot....
A data node went down abruptly due to an issue with the SPS-BD Smart Array PCIe SAS Expander; once the hardware was changed, node c... Prayank Saxena
03:30 AM Bug #46065 (Fix Under Review): sudo missing from command in monitor-bootstrapping procedure
https://github.com/ceph/ceph/pull/35635 Zac Dover
03:25 AM Bug #46065 (Resolved): sudo missing from command in monitor-bootstrapping procedure
Where:
https://docs.ceph.com/docs/master/install/manual-deployment/#monitor-bootstrapping
What:
<badone> https:/...
Zac Dover

06/17/2020

09:21 PM Bug #45991 (Pending Backport): PG merge: FAILED ceph_assert(info.history.same_interval_since != 0)
Neha Ojha
11:42 AM Bug #45991 (Fix Under Review): PG merge: FAILED ceph_assert(info.history.same_interval_since != 0)
Kefu Chai
09:19 PM Bug #46024 (Fix Under Review): larger osd_scrub_max_preemptions values cause Floating point excep...
Neha Ojha
09:19 PM Bug #46043 (Need More Info): osd/ECBackend.cc: 1551: FAILED assert(!(*m).is_missing(hoid))
It is really hard to say what caused this assert without enough debug logging and I doubt we will be able to reproduce t... Neha Ojha
07:41 AM Bug #46043 (Need More Info): osd/ECBackend.cc: 1551: FAILED assert(!(*m).is_missing(hoid))
We observed this crash on one of the customer servers:... Mykola Golub
05:22 PM Feature #41564 (Pending Backport): Issue health status warning if num_shards_repaired exceeds som...
David Zafman
03:35 PM Bug #46053 (Resolved): osd: wakeup all threads of shard rather than one thread
Neha Ojha

06/16/2020

01:52 AM Bug #46024 (Resolved): larger osd_scrub_max_preemptions values cause Floating point exception

A non-default large osd_scrub_max_preemptions value (e.g., 32) would cause scrubber.preempt_divisor underflow and...
xie xingguo
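
The entry above is truncated, so the exact failure path is not visible here. As a rough illustration only, the sketch below assumes a 32-bit divisor that starts at 1 and doubles on every preemption; that doubling rule, and the names `divisor_after` and `safe_chunk`, are hypothetical and not taken from the Ceph source. Under that assumption the divisor wraps to zero after 32 doublings, and on Linux an integer division by zero is typically reported as "Floating point exception".

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of a preemption divisor: starts at 1 and doubles on
// every preemption. Unsigned arithmetic is used so that the wraparound is
// well-defined in this sketch.
std::uint32_t divisor_after(unsigned preemptions) {
    std::uint32_t divisor = 1;
    for (unsigned i = 0; i < preemptions; ++i) {
        divisor *= 2;  // wraps modulo 2^32, so 32 doublings yield 0
    }
    return divisor;
}

// Guarded division: treat a zero divisor as 1 and never return less than
// the given floor. This is one possible mitigation, not the actual fix.
std::int64_t safe_chunk(std::int64_t chunk, std::uint32_t divisor,
                        std::int64_t floor) {
    const std::uint32_t d = (divisor == 0) ? 1u : divisor;
    const std::int64_t result = chunk / static_cast<std::int64_t>(d);
    return result < floor ? floor : result;
}
```

Dividing by `divisor_after(32)` without the guard would divide by zero; capping the number of doublings would be another way to avoid it.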

06/15/2020

07:25 PM Backport #46018 (Resolved): octopus: ceph_test_rados_watch_notify hang
Nathan Cutler
07:25 PM Backport #46017 (Resolved): nautilus: ceph_test_rados_watch_notify hang
https://github.com/ceph/ceph/pull/36031 Nathan Cutler
07:24 PM Backport #46016 (Resolved): octopus: osd-backfill-stats.sh failing intermittently in TEST_backfil...
https://github.com/ceph/ceph/pull/36030 Nathan Cutler
07:22 PM Bug #45612 (Resolved): qa: powercycle: install task runs twice with double unwind causing fatal e...
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler
07:21 PM Backport #46007 (Resolved): octopus: PrimaryLogPG.cc: 627: FAILED ceph_assert(!get_acting_recover...
https://github.com/ceph/ceph/pull/36029 Nathan Cutler

06/13/2020

05:26 AM Bug #45991 (Resolved): PG merge: FAILED ceph_assert(info.history.same_interval_since != 0)
http://qa-proxy.ceph.com/teuthology/xxg-2020-06-13_00:34:59-rados:thrash-wip-nautilus-nnnn-distro-basic-smithi/514318... xie xingguo

06/12/2020

02:50 PM Backport #45884: octopus: osd-scrub-repair.sh: SyntaxError: invalid syntax
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/35445
m...
Nathan Cutler
03:50 AM Backport #45884 (Resolved): octopus: osd-scrub-repair.sh: SyntaxError: invalid syntax
Brad Hubbard
12:31 AM Backport #45884: octopus: osd-scrub-repair.sh: SyntaxError: invalid syntax
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/35445
merged
Yuri Weinstein
02:50 PM Backport #45779: octopus: rados/test_envlibrados_for_rocksdb.sh build failure (seen in nautilus)
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/35443
m...
Nathan Cutler
03:47 AM Backport #45779 (Resolved): octopus: rados/test_envlibrados_for_rocksdb.sh build failure (seen in...
Brad Hubbard
12:30 AM Backport #45779: octopus: rados/test_envlibrados_for_rocksdb.sh build failure (seen in nautilus)
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/35443
merged
Yuri Weinstein
02:49 PM Backport #45673 (Resolved): octopus: qa: powercycle: install task runs twice with double unwind c...
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/35441
m...
Nathan Cutler
12:30 AM Backport #45673: octopus: qa: powercycle: install task runs twice with double unwind causing fata...
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/35441
merged
Yuri Weinstein
09:33 AM Documentation #45988: [doc/os]: Centos 8 is not listed even though it is supported
I confirm that there is a row in this table that mentions Centos 8, and that this line appears when I build the docs ... Zac Dover
09:24 AM Documentation #45988 (Resolved): [doc/os]: Centos 8 is not listed even though it is supported
19
https://docs.ceph.com/docs/master/releases/octopus/
https://docs.ceph.com/docs/octopus/start/os-recommendations/...
Zac Dover

06/11/2020

05:23 PM Backport #45780: nautilus: rados/test_envlibrados_for_rocksdb.sh build failure (seen in nautilus)
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/35387
m...
Nathan Cutler
01:26 AM Bug #45795 (Pending Backport): PrimaryLogPG.cc: 627: FAILED ceph_assert(!get_acting_recovery_back...
Kefu Chai
01:21 AM Bug #38357: ClsLock.TestExclusiveEphemeralStealEphemeral failed
... Kefu Chai

06/10/2020

09:30 PM Bug #45916 (Fix Under Review): cls_lock: unlimited shared lock created by libradosstriper api let...
Neha Ojha
09:25 PM Bug #43861 (Pending Backport): ceph_test_rados_watch_notify hang
Let's remove these tests from the stable branches too. Josh Durgin
09:02 AM Feature #41564 (In Progress): Issue health status warning if num_shards_repaired exceeds some thr...
David Zafman
12:25 AM Bug #44314 (Pending Backport): osd-backfill-stats.sh failing intermittently in TEST_backfill_size...
David Zafman

06/09/2020

09:34 PM Backport #45780 (Resolved): nautilus: rados/test_envlibrados_for_rocksdb.sh build failure (seen i...
Brad Hubbard
02:58 PM Backport #45780: nautilus: rados/test_envlibrados_for_rocksdb.sh build failure (seen in nautilus)
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/35387
merged
Yuri Weinstein
09:02 PM Bug #42716: Pool creation error message is hidden on FileStore-backed pools
That wasn't the initial issue reported.
What happens if you run "ceph osd pool create foo2 2048" instead? (assumin...
Dimitri Savineau
07:38 PM Bug #42716 (Resolved): Pool creation error message is hidden on FileStore-backed pools
closing this as already resolved.... Deepika Upadhyay
02:41 PM Bug #36337: OSDs crash with failed assertion in PGLog::merge_log as logs do not overlap
... Neha Ojha
02:41 PM Bug #45956 (New): verify takes forever to finish
rados/verify/{centos_latest.yaml ceph.yaml clusters/{fixed-2.yaml openstack.yaml} d-thrash/default/{default.yaml thra... Kefu Chai
12:24 PM Bug #45661 (Resolved): valgrind issue: UninitValue in ProtocolV2
In @master@ the PR #35407 has been closed in favor of https://github.com/ceph/ceph/pull/35186.
#35407 still might be...
Radoslaw Zarzynski
06:34 AM Bug #45948 (Duplicate): ceph_test_rados_delete_pools_parallel failed with error -2 on nautilus
Oops, this is a dup of #43887 Brad Hubbard
06:31 AM Bug #45948 (Duplicate): ceph_test_rados_delete_pools_parallel failed with error -2 on nautilus
/a/yuriw-2020-06-08_16:06:08-rados-wip-yuri2-testing-2020-06-08-1458-nautilus-distro-basic-smithi/5129541... Brad Hubbard
06:06 AM Bug #45947: ceph_test_rados_watch_notify hang seen in nautilus
Note https://tracker.ceph.com/issues/43861 removed this test from master because it was hanging. Brad Hubbard
06:02 AM Bug #45947: ceph_test_rados_watch_notify hang seen in nautilus
This is very similar to what is seen in #45946 so they may be related. Brad Hubbard
06:01 AM Bug #45947 (New): ceph_test_rados_watch_notify hang seen in nautilus
/a/yuriw-2020-06-08_16:06:08-rados-wip-yuri2-testing-2020-06-08-1458-nautilus-distro-basic-smithi/5129565... Brad Hubbard
05:32 AM Bug #45946 (New): ceph_test_rados_delete_pools_parallel hang seen in octopus
/a/yuriw-2020-05-29_15:51:00-rados-wip-yuri-testing-2020-05-28-2238-octopus-distro-basic-smithi/5103106... Brad Hubbard
04:28 AM Bug #20960: ceph_test_rados: mismatched version (due to pg import/export)
... Kefu Chai
12:05 AM Bug #44510: osd/osd-recovery-space.sh TEST_recovery_test_simple failure
Seen again:
http://pulpito.ceph.com/dzafman-2020-06-08_11:45:40-rados-wip-zafman-testing-distro-basic-smithi/5130114
David Zafman

06/08/2020

11:51 PM Bug #43888: osd/osd-bench.sh 'tell osd.N bench' hang
Saw this in at least 17 jobs:
http://pulpito.ceph.com/dzafman-2020-06-08_11:45:40-rados-wip-zafman-testing-distro-...
David Zafman
11:39 PM Bug #45944 (Triaged): osd/osd-markdown.sh: TEST_osd_stop failed
This appears to be a rare condition where a 15-second sleep was not enough. Neha Ojha
09:14 PM Bug #45944 (Triaged): osd/osd-markdown.sh: TEST_osd_stop failed
... Neha Ojha
09:10 PM Bug #45318: Health check failed: 2/6 mons down, quorum b,a,c,e (MON_DOWN)" in cluster log running...
rados/multimon/{clusters/21 msgr-failures/few msgr/async-v1only no_pools objectstore/bluestore-comp-zlib rados suppor... Neha Ojha
07:39 PM Bug #45943 (Fix Under Review): Ceph Monitor heartbeat grace period does not reset.
Sridhar Seshasayee
07:09 PM Bug #45943 (Resolved): Ceph Monitor heartbeat grace period does not reset.
The heartbeat grace timer does not reset after the cluster network has been stable for multiple days.
Implement a mechanism to...
Sridhar Seshasayee
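
The description above is cut off before the proposed mechanism, so what follows is only a generic sketch of the idea of resetting an inflated grace period after a quiet interval. The class `GraceTracker`, its parameters, and the fixed-step growth rule are all hypothetical; nothing here is taken from the Ceph monitor code or from the linked fix.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical tracker: each reported failure adds a fixed step to the
// grace period, and the extra grace is dropped once no failure has been
// seen for a full stability window.
class GraceTracker {
public:
    using seconds = std::chrono::seconds;

    GraceTracker(seconds base, seconds step, seconds stable_window)
        : base_(base), step_(step), stable_window_(stable_window) {}

    // Record a failure observed at time `now` (seconds since some epoch).
    void on_failure(seconds now) {
        extra_ += step_;
        last_failure_ = now;
    }

    // Return the grace in effect at `now`, resetting the accumulated extra
    // if the network has been stable for at least the window.
    seconds grace(seconds now) {
        if (extra_ > seconds(0) && now - last_failure_ >= stable_window_) {
            extra_ = seconds(0);
        }
        return base_ + extra_;
    }

private:
    seconds base_;
    seconds step_;
    seconds stable_window_;
    seconds extra_{0};
    seconds last_failure_{0};
};
```

With a 20-second base, a 5-second step, and a one-hour window, a single failure raises the grace to 25 seconds, and it falls back to 20 seconds once an hour passes without another failure.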
06:31 PM Backport #45891 (In Progress): luminous: osd: pg stuck in waitactingchange when new acting set do...
Nathan Cutler
06:22 PM Backport #45892 (In Progress): mimic: osd: pg stuck in waitactingchange when new acting set doesn...
Nathan Cutler
12:51 PM Bug #45795 (Fix Under Review): PrimaryLogPG.cc: 627: FAILED ceph_assert(!get_acting_recovery_back...
Ilya Dryomov
07:01 AM Bug #45916: cls_lock: unlimited shared lock created by libradosstriper api let node crash
add pr: https://github.com/ceph/ceph/pull/35467 Zhenyi Shu
06:50 AM Bug #45916 (Fix Under Review): cls_lock: unlimited shared lock created by libradosstriper api let...
_Background: Ceph Luminous is running in our production environment, and a service uses the libradosstriper API to access Ceph._
W...
Zhenyi Shu

06/06/2020

08:45 AM Backport #45357 (Resolved): octopus: rados: Sharded OpWQ drops suicide_grace after waiting for work
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/34881
m...
Nathan Cutler
08:31 AM Backport #45884 (In Progress): octopus: osd-scrub-repair.sh: SyntaxError: invalid syntax
Nathan Cutler
08:31 AM Backport #45882 (In Progress): octopus: Objecter: don't attempt to read from non-primary on EC pools
Nathan Cutler
08:30 AM Backport #45779 (In Progress): octopus: rados/test_envlibrados_for_rocksdb.sh build failure (seen...
Nathan Cutler
08:29 AM Backport #45775 (In Progress): octopus: build_incremental_map_msg missing incremental map while s...
Nathan Cutler
08:28 AM Backport #45673 (In Progress): octopus: qa: powercycle: install task runs twice with double unwin...
Nathan Cutler
12:53 AM Bug #44314 (In Progress): osd-backfill-stats.sh failing intermittently in TEST_backfill_sizeup_ou...
David Zafman

06/05/2020

10:52 PM Bug #44314: osd-backfill-stats.sh failing intermittently in TEST_backfill_sizeup_out() (degraded ...

It would be helpful to see the osd logs when this happens. We are expecting the following sequence to occur.
St...
David Zafman
04:20 PM Bug #45721: CommandFailedError: Command failed (workunit test rados/test_python.sh) FAIL: test_ra...
/a/yuriw-2020-06-04_18:03:48-rados-wip-yuri2-testing-2020-06-03-2341-MASTER-distro-basic-smithi/5117777 Neha Ojha
04:17 PM Bug #45424: api_watch_notify_pp: [ FAILED ] LibRadosWatchNotifyECPP.WatchNotify watch_notify_cx...
/a/yuriw-2020-06-04_18:03:48-rados-wip-yuri2-testing-2020-06-03-2341-MASTER-distro-basic-smithi/5117783 Neha Ojha
04:01 PM Bug #20960: ceph_test_rados: mismatched version (due to pg import/export)
/a/yuriw-2020-06-04_18:03:48-rados-wip-yuri2-testing-2020-06-03-2341-MASTER-distro-basic-smithi/5118028 Neha Ojha
03:58 PM Bug #44517: osd/osd-backfill-space.sh TEST_backfill_multi_partial: pgs didn't go active+clean
... Neha Ojha

06/04/2020

09:15 PM Bug #45868: rados_api_tests: LibRadosWatchNotify.AioWatchNotify2 fails
Similar... Neha Ojha
09:06 PM Bug #45661 (Fix Under Review): valgrind issue: UninitValue in ProtocolV2
https://github.com/ceph/ceph/pull/35407 Radoslaw Zarzynski
10:07 AM Bug #45661: valgrind issue: UninitValue in ProtocolV2
Pin-pointed to a branch of @PrimaryLogPG::do_manifest_flush()@:... Radoslaw Zarzynski
08:36 AM Bug #45661: valgrind issue: UninitValue in ProtocolV2
... Radoslaw Zarzynski
06:08 PM Bug #45795: PrimaryLogPG.cc: 627: FAILED ceph_assert(!get_acting_recovery_backfill().empty())
Ah, that makes sense. It should suffice to simply not populate_obc_watchers if replica. Samuel Just
05:42 PM Bug #45795: PrimaryLogPG.cc: 627: FAILED ceph_assert(!get_acting_recovery_backfill().empty())
After more digging, this doesn't appear to be related to notifies being sent to replicas.
The issue seems to be wi...
Ilya Dryomov
12:48 PM Backport #45890 (In Progress): nautilus: osd: pg stuck in waitactingchange when new acting set do...
Nathan Cutler
11:58 AM Backport #45890 (Resolved): nautilus: osd: pg stuck in waitactingchange when new acting set doesn...
https://github.com/ceph/ceph/pull/35389 Nathan Cutler
12:44 PM Backport #45883 (In Progress): nautilus: osd-scrub-repair.sh: SyntaxError: invalid syntax
Nathan Cutler
11:55 AM Backport #45883 (Resolved): nautilus: osd-scrub-repair.sh: SyntaxError: invalid syntax
https://github.com/ceph/ceph/pull/35388 Nathan Cutler
12:44 PM Backport #45780 (In Progress): nautilus: rados/test_envlibrados_for_rocksdb.sh build failure (see...
Nathan Cutler
12:43 PM Backport #45776 (In Progress): nautilus: build_incremental_map_msg missing incremental map while ...
Nathan Cutler
11:59 AM Backport #45892 (Rejected): mimic: osd: pg stuck in waitactingchange when new acting set doesn't ...
https://github.com/ceph/ceph/pull/35484 Nathan Cutler
11:59 AM Backport #45891 (Rejected): luminous: osd: pg stuck in waitactingchange when new acting set doesn...
https://github.com/ceph/ceph/pull/35485 Nathan Cutler
11:55 AM Backport #45884 (Resolved): octopus: osd-scrub-repair.sh: SyntaxError: invalid syntax
https://github.com/ceph/ceph/pull/35445 Nathan Cutler
11:55 AM Backport #45882 (Resolved): octopus: Objecter: don't attempt to read from non-primary on EC pools
https://github.com/ceph/ceph/pull/35444 Nathan Cutler
07:16 AM Bug #45871 (New): Incorrect (0) number of slow requests in health check
ceph version 14.2.9-899-gc02349c600 (c02349c60052aaa6c7bd0c2270c7f7be16fab632) nautilus (stable)
Our cluster shows...
Eugen Block
12:24 AM Bug #40117 (Duplicate): PG stuck in WaitActingChange
Fixed in https://tracker.ceph.com/issues/41190 Neha Ojha
12:21 AM Bug #41190 (Pending Backport): osd: pg stuck in waitactingchange when new acting set doesn't change
Neha Ojha
12:20 AM Bug #41236 (Resolved): cosbench failures in rados/perf
Neha Ojha
12:18 AM Bug #41550 (Resolved): os/bluestore: fadvise_flag leak in generate_transaction
Neha Ojha
12:17 AM Bug #41677 (Resolved): Cephmon:fix mon crash
Fixed as a part of https://tracker.ceph.com/issues/41680. Neha Ojha
12:14 AM Bug #41913 (Resolved): With auto scaler operating stopping an OSD can lead to COT crashing instea...
Neha Ojha
12:08 AM Bug #45356 (Resolved): nautilus: rados/upgrade/mimic-x-singleton failures due to mon_client_direc...
Neha Ojha

06/03/2020

09:06 PM Bug #45733 (Pending Backport): osd-scrub-repair.sh: SyntaxError: invalid syntax
Neha Ojha
06:12 PM Bug #45733: osd-scrub-repair.sh: SyntaxError: invalid syntax
https://github.com/ceph/ceph/pull/35279 merged Yuri Weinstein
08:50 PM Backport #45357: octopus: rados: Sharded OpWQ drops suicide_grace after waiting for work
Dan Hill wrote:
> https://github.com/ceph/ceph/pull/34881
merged
Yuri Weinstein
08:34 PM Bug #45868 (Resolved): rados_api_tests: LibRadosWatchNotify.AioWatchNotify2 fails
... Neha Ojha
08:30 PM Bug #45761: mon_thrasher: "Error ENXIO: mon unavailable" during sync_force command leads to "fail...
/a/yuriw-2020-06-02_15:07:59-rados-wip-yuri7-testing-2020-06-01-2256-octopus-distro-basic-smithi/5113082 - octopus Neha Ojha
04:44 AM Bug #45761: mon_thrasher: "Error ENXIO: mon unavailable" during sync_force command leads to "fail...
Moving this since it appears to be a problem with the mon_thrasher (or the MONs or monclients).... Brad Hubbard
02:44 PM Bug #45793 (Pending Backport): Objecter: don't attempt to read from non-primary on EC pools
Kefu Chai
01:24 PM Backport #41533: mimic: Move bluefs alloc size initialization log message to log level 1
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/30219
m...
Nathan Cutler
12:59 PM Bug #45857 (New): crimson/alien_store: alienstore cannot open_collections
setup: setting debug level 20 for bluestore, filestore and osd and using seastar with seastar_default_allocator + Rel... Deepika Upadhyay
01:50 AM Bug #9984: lttng_probe_unregister hangs on shutdown
/a/yuriw-2020-05-30_02:18:17-rados-wip-yuri-master_5.29.20-distro-basic-smithi/5104372
Possibly an instance of thi...
Brad Hubbard

06/02/2020

07:14 PM Bug #45795: PrimaryLogPG.cc: 627: FAILED ceph_assert(!get_acting_recovery_backfill().empty())
I see. Watch being a write and notify being a read has always tripped me up, but I guess I looked at it from the side e... Ilya Dryomov
03:28 PM Bug #45795: PrimaryLogPG.cc: 627: FAILED ceph_assert(!get_acting_recovery_backfill().empty())
Well, osd-side notifies are reads in that they don't result in mutation. I think lingerops in general probably shoul... Samuel Just
10:38 AM Bug #45795: PrimaryLogPG.cc: 627: FAILED ceph_assert(!get_acting_recovery_backfill().empty())
Samuel Just wrote:
> Did that fire on the replica? At a guess, the problem is that notifies are being sent to repli...
Ilya Dryomov
02:07 AM Bug #45795: PrimaryLogPG.cc: 627: FAILED ceph_assert(!get_acting_recovery_backfill().empty())
It probably isn't https://tracker.ceph.com/issues/15391. Samuel Just
02:05 AM Bug #45795: PrimaryLogPG.cc: 627: FAILED ceph_assert(!get_acting_recovery_backfill().empty())
Did that fire on the replica? At a guess, the problem is that notifies are being sent to replicas, which would be wr... Samuel Just
07:08 PM Bug #45802 (Resolved): Health check failed: Reduced data availability: PG_AVAILABILITY
Casey Bodley
06:19 PM Bug #45802 (Fix Under Review): Health check failed: Reduced data availability: PG_AVAILABILITY
Neha Ojha
06:17 PM Bug #45802 (Triaged): Health check failed: Reduced data availability: PG_AVAILABILITY
Same root cause as https://tracker.ceph.com/issues/45619.
http://pulpito.ceph.com/teuthology-2020-05-30_03:05:02...
Neha Ojha
07:16 AM Bug #45809 (New): When out a osd, the `MAX AVAIL` doesn't change.
Environment: Luminous 12.2.12
I have a question about the pool's `MAX AVAIL` in `ceph df`.
When I out an OSD, th...
chao wang
06:00 AM Bug #45761: mon_thrasher: "Error ENXIO: mon unavailable" during sync_force command leads to "fail...
/a/yuriw-2020-05-30_02:18:17-rados-wip-yuri-master_5.29.20-distro-basic-smithi/5104057 Brad Hubbard
05:13 AM Bug #45661: valgrind issue: UninitValue in ProtocolV2
/a/yuriw-2020-05-30_02:18:17-rados-wip-yuri-master_5.29.20-distro-basic-smithi/5103952
/a/yuriw-2020-05-30_02:18:17-...
Brad Hubbard

06/01/2020

03:21 PM Bug #45802 (Resolved): Health check failed: Reduced data availability: PG_AVAILABILITY
Multiple RGW tests are failing on different branches, with:... Casey Bodley
12:13 AM Bug #45796 (New): Ceph mon's sporadically report slow ops
We have recently upgraded our cluster to 14.2.9 from 10.2.6 and are in the process of a rolling rebuild of many of th... David Hows

05/31/2020

01:20 PM Bug #45795: PrimaryLogPG.cc: 627: FAILED ceph_assert(!get_acting_recovery_backfill().empty())
Sam, could you please take a look? Ilya Dryomov
01:19 PM Bug #45795 (Resolved): PrimaryLogPG.cc: 627: FAILED ceph_assert(!get_acting_recovery_backfill().e...
I'm running into this assert while trying to exercise krbd with replica reads (particularly balanced reads):... Ilya Dryomov
12:34 PM Bug #45793: Objecter: don't attempt to read from non-primary on EC pools
Marking only for octopus, since replica reads are safe for general use only in octopus. Ilya Dryomov
12:32 PM Bug #45793 (Fix Under Review): Objecter: don't attempt to read from non-primary on EC pools
Ilya Dryomov
12:25 PM Bug #45793 (Resolved): Objecter: don't attempt to read from non-primary on EC pools
Ilya Dryomov

05/29/2020

05:31 PM Backport #45781 (Rejected): mimic: rados/test_envlibrados_for_rocksdb.sh build failure (seen in n...
Nathan Cutler
05:31 PM Backport #45780 (Resolved): nautilus: rados/test_envlibrados_for_rocksdb.sh build failure (seen i...
https://github.com/ceph/ceph/pull/35387 Nathan Cutler
05:31 PM Backport #45779 (Resolved): octopus: rados/test_envlibrados_for_rocksdb.sh build failure (seen in...
https://github.com/ceph/ceph/pull/35443 Nathan Cutler
05:30 PM Backport #45776 (Resolved): nautilus: build_incremental_map_msg missing incremental map while sna...
https://github.com/ceph/ceph/pull/35386 Nathan Cutler
05:30 PM Backport #45775 (Resolved): octopus: build_incremental_map_msg missing incremental map while snap...
https://github.com/ceph/ceph/pull/35442 Nathan Cutler
05:16 AM Bug #45761 (Need More Info): mon_thrasher: "Error ENXIO: mon unavailable" during sync_force comma...
/a/yuriw-2020-05-28_02:23:45-rados-wip-yuri-master_5.27.20-distro-basic-smithi/5097794... Brad Hubbard
04:11 AM Bug #45619 (Resolved): Health check failed: Reduced data availability: PG_AVAILABILITY
Kefu Chai
03:58 AM Bug #45760 (Resolved): osd-scrub-snaps.sh: TEST_scrub_snaps failed
Neha Ojha

05/28/2020

10:48 PM Bug #45760 (Fix Under Review): osd-scrub-snaps.sh: TEST_scrub_snaps failed
Neha Ojha
09:12 PM Bug #45760 (Resolved): osd-scrub-snaps.sh: TEST_scrub_snaps failed
... Neha Ojha
09:39 PM Bug #45660 (Resolved): osd-scrub-repair.sh:TEST_corrupt_scrub_replicated failed
Neha Ojha
12:42 AM Bug #45660 (Fix Under Review): osd-scrub-repair.sh:TEST_corrupt_scrub_replicated failed
Neha Ojha
08:57 PM Bug #45619 (Fix Under Review): Health check failed: Reduced data availability: PG_AVAILABILITY
Neha Ojha
01:52 PM Bug #41399 (Resolved): Move bluefs alloc size initialization log message to log level 1
Vikhyat Umrao
01:52 PM Backport #41533 (Resolved): mimic: Move bluefs alloc size initialization log message to log level 1
Vikhyat Umrao
07:17 AM Bug #45606 (Pending Backport): build_incremental_map_msg missing incremental map while snaptrim o...
Kefu Chai
06:38 AM Bug #44595: cache tiering: Error: oid 48 copy_from 493 returned error code -2
... Kefu Chai
06:08 AM Bug #45661: valgrind issue: UninitValue in ProtocolV2
@/a/kchai-2020-05-27_23:43:53-rados-wip-kefu-testing-2020-05-27-2242-distro-basic-smithi/5097299/remote/*/log/valgrin... Kefu Chai
02:10 AM Bug #45661: valgrind issue: UninitValue in ProtocolV2
/a/yuriw-2020-05-24_19:30:40-rados-wip-yuri-master_5.24.20-distro-basic-smithi/5088037
/a/yuriw-2020-05-24_19:30:40-...
Brad Hubbard
 
