Activity
From 05/28/2020 to 06/26/2020
06/26/2020
- 07:27 PM Bug #46178: slow request osd_op(... (undecoded) ondisk+retry+read+ignore_overlay+known_if_redirec...
- http://pulpito.ceph.com/mgfritch-2020-06-26_02:07:27-rados-wip-mgfritch-testing-2020-06-25-1855-distro-basic-smithi/...
- 06:21 PM Bug #46180: qa: Scrubbing terminated -- not all pgs were active and clean.
- Here's a reliable reproducer for the issue:
-s rados/singleton-nomsgr -c master --filter 'all/health-warnings rado...
- 06:50 AM Bug #46180: qa: Scrubbing terminated -- not all pgs were active and clean.
- I think it has to do with reconnect handling and how connections are reused.
This part of ProtocolV2 is pretty fra...
- 05:04 AM Bug #46180: qa: Scrubbing terminated -- not all pgs were active and clean.
- This is a msgr2.1 issue.
- 05:48 PM Bug #46225 (Triaged): Health check failed: 1 osds down (OSD_DOWN)
- 05:39 PM Bug #46225: Health check failed: 1 osds down (OSD_DOWN)
- Also, related to https://tracker.ceph.com/issues/46180...
- 10:57 AM Bug #46225 (Duplicate): Health check failed: 1 osds down (OSD_DOWN)
- /a/sseshasa-2020-06-24_17:46:09-rados-wip-sseshasa-testing-2020-06-24-1858-distro-basic-smithi/5176410
2020-06-2...
- 05:34 PM Bug #46227 (Duplicate): Segmentation fault when running ceph_test_keyvaluedb command as part of a...
- Duplicate of https://tracker.ceph.com/issues/46054
- 11:19 AM Bug #46227 (Duplicate): Segmentation fault when running ceph_test_keyvaluedb command as part of a...
- /a/sseshasa-2020-06-24_17:46:09-rados-wip-sseshasa-testing-2020-06-24-1858-distro-basic-smithi/5176446
Unfortunate...
- 05:31 PM Bug #46179 (Triaged): Health check failed: Reduced data availability: PG_AVAILABILITY
- 05:11 PM Bug #46179: Health check failed: Reduced data availability: PG_AVAILABILITY
- This failure is different from the one seen in the RGW suite earlier due to upmap. This is related to https://tracker...
- 07:32 AM Bug #46179: Health check failed: Reduced data availability: PG_AVAILABILITY
- /a/sseshasa-2020-06-24_17:46:09-rados-wip-sseshasa-testing-2020-06-24-1858-distro-basic-smithi/
job ID: 5176200
F...
- 05:31 PM Bug #46224 (Fix Under Review): Health check failed: 4 mgr modules have failed (MGR_MODULE_ERROR)
- 10:44 AM Bug #46224 (Resolved): Health check failed: 4 mgr modules have failed (MGR_MODULE_ERROR)
- /a/sseshasa-2020-06-24_17:46:09-rados-wip-sseshasa-testing-2020-06-24-1858-distro-basic-smithi/5176341 and
/a/ssesha...
- 05:30 PM Bug #46222 (In Progress): Cbt installation task for cosbench fails.
- 09:03 AM Bug #46222: Cbt installation task for cosbench fails.
- See /a/sseshasa-2020-06-24_17:46:09-rados-wip-sseshasa-testing-2020-06-24-1858-distro-basic-smithi/5176322 as well
- 09:00 AM Bug #46222 (Won't Fix): Cbt installation task for cosbench fails.
- /a/sseshasa-2020-06-24_17:46:09-rados-wip-sseshasa-testing-2020-06-24-1858-distro-basic-smithi/5176309
2020-06-2...
- 04:48 PM Feature #46238 (New): raise a HEALTH warn, if OSDs use the cluster_network for the front
- Related to: https://tracker.ceph.com/issues/46230
- 12:17 PM Backport #46229 (In Progress): octopus: Ceph Monitor heartbeat grace period does not reset.
- 12:14 PM Backport #46229 (New): octopus: Ceph Monitor heartbeat grace period does not reset.
- 11:48 AM Backport #46229 (Resolved): octopus: Ceph Monitor heartbeat grace period does not reset.
- https://github.com/ceph/ceph/pull/35799
- 12:13 PM Backport #46228 (In Progress): nautilus: Ceph Monitor heartbeat grace period does not reset.
- 12:13 PM Backport #46228 (New): nautilus: Ceph Monitor heartbeat grace period does not reset.
- 11:47 AM Backport #46228 (Resolved): nautilus: Ceph Monitor heartbeat grace period does not reset.
- https://github.com/ceph/ceph/pull/35798
- 11:43 AM Bug #45943 (Pending Backport): Ceph Monitor heartbeat grace period does not reset.
- 11:14 AM Documentation #46203 (Resolved): docs.ceph.com is down
- docs.ceph.com returned four hours later.
- 08:40 AM Bug #24057: cbt fails to copy results to the archive dir
- Observed the issue during this run:
/a/sseshasa-2020-06-24_17:46:09-rados-wip-sseshasa-testing-2020-06-24-1858-distr...
- 07:28 AM Bug #44595: cache tiering: Error: oid 48 copy_from 493 returned error code -2
- /a/sseshasa-2020-06-24_17:46:09-rados-wip-sseshasa-testing-2020-06-24-1858-distro-basic-smithi/
job ID: 5176184
...
- 07:18 AM Bug #45441: rados: Health check failed: 1/3 mons down, quorum a,c (MON_DOWN)" in cluster log'
- Observing the issue during this run:
/a/sseshasa-2020-06-24_17:46:09-rados-wip-sseshasa-testing-2020-06-24-1858-dist...
- 04:38 AM Bug #46125: ceph mon memory increasing
- I will try with the default settings for the monitor. With the current config file parameters, the monitor is using 1 GB.
I...
06/25/2020
- 11:56 PM Bug #46216 (Resolved): mon: log entry with garbage generated by bad memory access
- Causes the mgr to segmentation fault:...
- 10:27 PM Bug #46178: slow request osd_op(... (undecoded) ondisk+retry+read+ignore_overlay+known_if_redirec...
- /a/yuvalif-2020-06-23_14:40:15-rgw-wip-yuval-test-35331-35155-distro-basic-smithi/5173465
Seems very likely to hav...
- 09:09 PM Bug #46125 (Need More Info): ceph mon memory increasing
- Can you try with the default settings for the monitor? What level of memory usage are you seeing exactly?
There is...
- 07:40 PM Bug #46180: qa: Scrubbing terminated -- not all pgs were active and clean.
- The common thing in all of these is that the tests are all failing while running the ceph task, no thrashing or anyth...
- 03:15 PM Bug #46180: qa: Scrubbing terminated -- not all pgs were active and clean.
- Saw the same error during this run:
http://pulpito.ceph.com/sseshasa-2020-06-24_17:46:09-rados-wip-sseshasa-testing-...
- 05:29 PM Bug #46211 (Duplicate): qa: pools stuck in creating
- 05:26 PM Bug #46211 (Duplicate): qa: pools stuck in creating
- During cluster setup for the CephFS suites, we see this failure:...
- 03:44 PM Bug #39039: mon connection reset, command not resent
- Hitting this issue on octopus, Fedora 32:...
- 02:18 PM Documentation #46203 (In Progress): docs.ceph.com is down
- I'm afraid this is outside my control. We're at the mercy of our cloud provider. Pretty sure it's this: http://trav...
- 07:49 AM Documentation #46203 (Resolved): docs.ceph.com is down
- docs.ceph.com has been down since 17:35 AEST on 25 Jun 2020, at the latest.
https://downforeveryoneorjustme.com/docs.ce...
06/24/2020
- 02:16 PM Bug #46180 (Resolved): qa: Scrubbing terminated -- not all pgs were active and clean.
- Seeing several test failures in the rgw suite:...
- 02:09 PM Bug #46179 (Duplicate): Health check failed: Reduced data availability: PG_AVAILABILITY
- multiple RGW tests are failing on different branches, with:...
- 01:20 PM Bug #46178: slow request osd_op(... (undecoded) ondisk+retry+read+ignore_overlay+known_if_redirec...
- http://pulpito.ceph.com/swagner-2020-06-24_11:30:44-rados:cephadm-wip-swagner3-testing-2020-06-24-1025-distro-basic-s...
- 01:19 PM Bug #46178: slow request osd_op(... (undecoded) ondisk+retry+read+ignore_overlay+known_if_redirec...
- http://pulpito.ceph.com/swagner-2020-06-24_11:30:44-rados:cephadm-wip-swagner3-testing-2020-06-24-1025-distro-basic-s...
- 01:16 PM Bug #46178: slow request osd_op(... (undecoded) ondisk+retry+read+ignore_overlay+known_if_redirec...
- http://pulpito.ceph.com/swagner-2020-06-24_11:30:44-rados:cephadm-wip-swagner3-testing-2020-06-24-1025-distro-basic-s...
- 12:57 PM Bug #46178 (Duplicate): slow request osd_op(... (undecoded) ondisk+retry+read+ignore_overlay+know...
- Saw this error yesterday for the first time:
http://pulpito.ceph.com/swagner-2020-06-23_13:15:09-rados:cephadm-wip...
- 10:37 AM Backport #45676 (Resolved): octopus: rados/test_envlibrados_for_rocksdb.sh fails on Xenial (seen ...
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/35236
m...
- 02:47 AM Bug #45721: CommandFailedError: Command failed (workunit test rados/test_python.sh) FAIL: test_ra...
- ...
- 01:36 AM Backport #46164 (In Progress): nautilus: osd: make message cap option usable again
- 01:13 AM Backport #46164 (Resolved): nautilus: osd: make message cap option usable again
- https://github.com/ceph/ceph/pull/35738
- 01:28 AM Backport #46165 (In Progress): octopus: osd: make message cap option usable again
- 01:13 AM Backport #46165 (Resolved): octopus: osd: make message cap option usable again
- https://github.com/ceph/ceph/pull/35737
- 12:18 AM Bug #46143 (Pending Backport): osd: make message cap option usable again
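Bug #46143 above is about restoring the message throttler so that osd_client_message_cap can be changed at runtime. A minimal sketch of why a live throttler object is what makes an online-tunable cap possible (illustrative Python, not Ceph code; the class and method names are hypothetical):

```python
import threading

class MessageThrottle:
    """Illustrative cap on in-flight messages; the cap can be resized online."""

    def __init__(self, cap):
        self._cap = cap
        self._in_flight = 0
        self._cond = threading.Condition()

    def acquire(self):
        # Block until a slot is free under the *current* cap.
        with self._cond:
            while self._in_flight >= self._cap:
                self._cond.wait()
            self._in_flight += 1

    def release(self):
        with self._cond:
            self._in_flight -= 1
            self._cond.notify_all()

    def set_cap(self, cap):
        # Online change: takes effect for the next acquire(); waiters
        # are woken in case the cap was raised.
        with self._cond:
            self._cap = cap
            self._cond.notify_all()
```

Without such an object in the message path (the state after the reverted commit), there is nothing for a changed config value to act on, which is the point of the revert message quoted above.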
06/23/2020
- 08:11 PM Backport #45676: octopus: rados/test_envlibrados_for_rocksdb.sh fails on Xenial (seen in nautilus)
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/35236
merged
- 12:15 AM Bug #45944: osd/osd-markdown.sh: TEST_osd_stop failed
- ...
06/22/2020
- 09:59 PM Backport #46115 (In Progress): octopus: Add statfs output to ceph-objectstore-tool
- 09:37 PM Backport #46116 (In Progress): nautilus: Add statfs output to ceph-objectstore-tool
- 06:52 PM Bug #45944: osd/osd-markdown.sh: TEST_osd_stop failed
- /a/teuthology-2020-06-19_07:01:02-rados-master-distro-basic-smithi/5164221
- 05:53 PM Bug #46143 (Fix Under Review): osd: make message cap option usable again
- 05:36 PM Bug #46143 (In Progress): osd: make message cap option usable again
- 05:18 PM Bug #46143 (Resolved): osd: make message cap option usable again
- "This reverts commit 45d5ac3.
Without a msg throttler, we can't change osd_client_message_cap cap
online. The thr... - 04:57 PM Bug #41154: osd: pg unknown state
- I'm having this problem again....
- 03:19 PM Documentation #46141 (New): Document automatic OSD deployment behavior better
- Make certain that the documentation notifies readers that OSDs are automatically created, so that they are not caught...
- 09:12 AM Bug #46137: Monitor leader is marking multiple osd's down
- Every few minutes, multiple osds are going down and coming back up, which is causing data recovery. This is occurring ...
- 09:07 AM Bug #46137 (New): Monitor leader is marking multiple osd's down
- My ceph cluster consists of 5 MONs and 58 DNs with 1302 total osds (HDDs), running version 12.2.8 Luminous (stable), and Fi...
- 06:02 AM Bug #45943: Ceph Monitor heartbeat grace period does not reset.
- Updates from testing the fix:
OSD failure before being marked down:...
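A sketch of the behavior the fix for Bug #45943 is after (illustrative only, not the OSD's actual implementation; `BASE_GRACE` and `DECAY_HALFLIFE` are assumed names and values): the failure penalty added on top of the base grace should decay back toward zero once the cluster has been failure-free for a while, instead of sticking forever:

```python
BASE_GRACE = 20.0        # seconds; stand-in for osd_heartbeat_grace
DECAY_HALFLIFE = 3600.0  # assumption for this sketch, not a Ceph option

class AdaptiveGrace:
    """Grace grows on observed failures and decays back toward the base
    as failure-free time accumulates. The reported bug is the missing
    decay: an inflated grace that never resets."""

    def __init__(self, now=0.0):
        self.penalty = 0.0
        self.last_failure = now

    def record_failure(self, now, laggy_interval):
        # Each observed laggy failure inflates the grace.
        self.penalty += laggy_interval
        self.last_failure = now

    def grace(self, now):
        # Halve the accumulated penalty once per quiet half-life.
        quiet = max(0.0, now - self.last_failure)
        decayed = self.penalty * 0.5 ** (quiet / DECAY_HALFLIFE)
        return BASE_GRACE + decayed
```

With this shape, one failure that adds 40 s of penalty yields a 60 s grace immediately, 40 s after one quiet half-life, and effectively the 20 s base after a stable day.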
06/21/2020
- 02:17 PM Feature #24099: osd: Improve workflow when creating OSD on raw block device if there was bluestor...
- John Spray wrote:
> This seems like an odd idea -- if someone is doing OSD creation by hand, why would they want to ...
- 12:25 PM Documentation #46099: document statfs operation for ceph-objectstore-tool
if (op == "statfs") {
store_statfs_t statsbuf;
ret = fs->statfs(&statsbuf);
if (ret < 0) {
...- 12:10 PM Documentation #46126 (New): RGW docs lack an explanation of how permissions management works, esp...
- <dirtwash> you know its sshitty protocol and design if obvious things arent visible and default behavior doesnt work
...
- 08:02 AM Bug #46125: ceph mon memory increasing
- Hi,
I have deployed a ceph single-node cluster.
ceph version 14.2.9 (581f22da52345dba46ee232b73b990f06029a2a0) na...
- 07:13 AM Bug #46125 (Need More Info): ceph mon memory increasing
- Hi,
I have deployed a ceph single-node cluster.
ceph version 14.2.9 (581f22da52345dba46ee232b73b990f06029a2a0) ...
06/20/2020
- 10:12 PM Backport #46096 (In Progress): nautilus: Issue health status warning if num_shards_repaired excee...
- 10:09 PM Backport #46095 (In Progress): octopus: Issue health status warning if num_shards_repaired exceed...
- 09:57 PM Bug #45793 (Resolved): Objecter: don't attempt to read from non-primary on EC pools
- While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ...
- 09:56 PM Backport #45882 (Resolved): octopus: Objecter: don't attempt to read from non-primary on EC pools
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/35444
m...
- 09:56 PM Backport #45775 (Resolved): octopus: build_incremental_map_msg missing incremental map while snap...
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/35442
m...
- 07:59 AM Documentation #46120 (Resolved): Improve ceph-objectstore-tool documentation
- https://github.com/ceph/ceph/pull/33823
There are a number of comments by David Zafman that I failed to include in...
- 04:20 AM Bug #46065 (Resolved): sudo missing from command in monitor-bootstrapping procedure
06/19/2020
- 04:36 PM Backport #46116 (Resolved): nautilus: Add statfs output to ceph-objectstore-tool
- https://github.com/ceph/ceph/pull/35713
- 04:36 PM Backport #46115 (Resolved): octopus: Add statfs output to ceph-objectstore-tool
- https://github.com/ceph/ceph/pull/35715
- 05:00 AM Documentation #46099 (New): document statfs operation for ceph-objectstore-tool
- https://github.com/ceph/ceph/pull/35632
https://github.com/ceph/ceph/pull/33823
The affected file (I think) is ...
06/18/2020
- 11:26 PM Bug #46064 (Pending Backport): Add statfs output to ceph-objectstore-tool
- 01:13 AM Bug #46064 (Fix Under Review): Add statfs output to ceph-objectstore-tool
- 01:08 AM Bug #46064 (In Progress): Add statfs output to ceph-objectstore-tool
- 01:07 AM Bug #46064 (Resolved): Add statfs output to ceph-objectstore-tool
This will help diagnose out of space crashes:...
- 10:32 PM Backport #45882: octopus: Objecter: don't attempt to read from non-primary on EC pools
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/35444
merged
- 10:31 PM Backport #45775: octopus: build_incremental_map_msg missing incremental map while snaptrim or bac...
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/35442
merged
- 08:08 PM Backport #46096 (Resolved): nautilus: Issue health status warning if num_shards_repaired exceeds ...
- https://github.com/ceph/ceph/pull/36379
- 08:08 PM Backport #46095 (Resolved): octopus: Issue health status warning if num_shards_repaired exceeds s...
- https://github.com/ceph/ceph/pull/35685
- 08:06 PM Backport #46090 (Resolved): nautilus: PG merge: FAILED ceph_assert(info.history.same_interval_sin...
- https://github.com/ceph/ceph/pull/36161
- 08:06 PM Backport #46089 (Resolved): octopus: PG merge: FAILED ceph_assert(info.history.same_interval_sinc...
- https://github.com/ceph/ceph/pull/36033
- 08:06 PM Backport #46086 (Resolved): octopus: osd: wakeup all threads of shard rather than one thread
- https://github.com/ceph/ceph/pull/36032
- 10:40 AM Bug #46071 (New): potential rocksdb failure: few osd's service not starting up after node reboot....
- A data node went down abruptly due to an issue with the SPS-BD Smart Array PCIe SAS Expander; once the hardware was changed, the node c...
- 03:30 AM Bug #46065 (Fix Under Review): sudo missing from command in monitor-bootstrapping procedure
- https://github.com/ceph/ceph/pull/35635
- 03:25 AM Bug #46065 (Resolved): sudo missing from command in monitor-bootstrapping procedure
- Where:
https://docs.ceph.com/docs/master/install/manual-deployment/#monitor-bootstrapping
What:
<badone> https:/...
06/17/2020
- 09:21 PM Bug #45991 (Pending Backport): PG merge: FAILED ceph_assert(info.history.same_interval_since != 0)
- 11:42 AM Bug #45991 (Fix Under Review): PG merge: FAILED ceph_assert(info.history.same_interval_since != 0)
- 09:19 PM Bug #46024 (Fix Under Review): larger osd_scrub_max_preemptions values cause Floating point excep...
- 09:19 PM Bug #46043 (Need More Info): osd/ECBackend.cc: 1551: FAILED assert(!(*m).is_missing(hoid))
- It is really hard to say what caused this assert without enough debug logging, and I doubt we will be able to reproduce t...
- 07:41 AM Bug #46043 (Need More Info): osd/ECBackend.cc: 1551: FAILED assert(!(*m).is_missing(hoid))
- We observed this crash on one of the customer servers:...
- 05:22 PM Feature #41564 (Pending Backport): Issue health status warning if num_shards_repaired exceeds som...
- 03:35 PM Bug #46053 (Resolved): osd: wakeup all threads of shard rather than one thread
06/16/2020
- 01:52 AM Bug #46024 (Resolved): larger osd_scrub_max_preemptions values cause Floating point exception
A non-default large osd_scrub_max_preemptions value (e.g., 32) would cause scrubber.preempt_divisor underflow and...
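A minimal sketch of the reported failure mode (illustrative only; the halving-per-preemption arithmetic below is an assumption for the sketch, not the actual scrubber code): if each preemption doubles the divisor applied to the scrub chunk, a large preemption budget integer-divides the chunk down to zero, and a later division or modulo by it is the Floating point exception:

```python
def scrub_chunk(chunk_max: int, preemptions: int) -> int:
    # Illustrative: each preemption doubles the divisor applied to the
    # scrub chunk, so a large preemption budget drives the chunk to 0.
    return chunk_max // (2 ** preemptions)

chunk = scrub_chunk(1000, 32)  # e.g. osd_scrub_max_preemptions = 32
assert chunk == 0
try:
    _ = 4096 % chunk  # in C++, integer modulo by zero raises SIGFPE
except ZeroDivisionError:
    pass
```

Python surfaces this as ZeroDivisionError; in the OSD the equivalent integer operation kills the process with SIGFPE, matching the bug title.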
06/15/2020
- 07:25 PM Backport #46018 (Resolved): octopus: ceph_test_rados_watch_notify hang
- 07:25 PM Backport #46017 (Resolved): nautilus: ceph_test_rados_watch_notify hang
- https://github.com/ceph/ceph/pull/36031
- 07:24 PM Backport #46016 (Resolved): octopus: osd-backfill-stats.sh failing intermittently in TEST_backfil...
- https://github.com/ceph/ceph/pull/36030
- 07:22 PM Bug #45612 (Resolved): qa: powercycle: install task runs twice with double unwind causing fatal e...
- While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ...
- 07:21 PM Backport #46007 (Resolved): octopus: PrimaryLogPG.cc: 627: FAILED ceph_assert(!get_acting_recover...
- https://github.com/ceph/ceph/pull/36029
06/13/2020
- 05:26 AM Bug #45991 (Resolved): PG merge: FAILED ceph_assert(info.history.same_interval_since != 0)
- http://qa-proxy.ceph.com/teuthology/xxg-2020-06-13_00:34:59-rados:thrash-wip-nautilus-nnnn-distro-basic-smithi/514318...
06/12/2020
- 02:50 PM Backport #45884: octopus: osd-scrub-repair.sh: SyntaxError: invalid syntax
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/35445
m...
- 03:50 AM Backport #45884 (Resolved): octopus: osd-scrub-repair.sh: SyntaxError: invalid syntax
- 12:31 AM Backport #45884: octopus: osd-scrub-repair.sh: SyntaxError: invalid syntax
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/35445
merged
- 02:50 PM Backport #45779: octopus: rados/test_envlibrados_for_rocksdb.sh build failure (seen in nautilus)
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/35443
m...
- 03:47 AM Backport #45779 (Resolved): octopus: rados/test_envlibrados_for_rocksdb.sh build failure (seen in...
- 12:30 AM Backport #45779: octopus: rados/test_envlibrados_for_rocksdb.sh build failure (seen in nautilus)
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/35443
merged
- 02:49 PM Backport #45673 (Resolved): octopus: qa: powercycle: install task runs twice with double unwind c...
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/35441
m...
- 12:30 AM Backport #45673: octopus: qa: powercycle: install task runs twice with double unwind causing fata...
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/35441
merged
- 09:33 AM Documentation #45988: [doc/os]: Centos 8 is not listed even though it is supported
- I confirm that there is a row in this table that mentions Centos 8, and that this line appears when I build the docs ...
- 09:24 AM Documentation #45988 (Resolved): [doc/os]: Centos 8 is not listed even though it is supported
- 19
https://docs.ceph.com/docs/master/releases/octopus/
https://docs.ceph.com/docs/octopus/start/os-recommendations/...
06/11/2020
- 05:23 PM Backport #45780: nautilus: rados/test_envlibrados_for_rocksdb.sh build failure (seen in nautilus)
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/35387
m...
- 01:26 AM Bug #45795 (Pending Backport): PrimaryLogPG.cc: 627: FAILED ceph_assert(!get_acting_recovery_back...
- 01:21 AM Bug #38357: ClsLock.TestExclusiveEphemeralStealEphemeral failed
- ...
06/10/2020
- 09:30 PM Bug #45916 (Fix Under Review): cls_lock: unlimited shared lock created by libradosstriper api let...
- 09:25 PM Bug #43861 (Pending Backport): ceph_test_rados_watch_notify hang
- Let's remove these tests from the stable branches too.
- 09:02 AM Feature #41564 (In Progress): Issue health status warning if num_shards_repaired exceeds some thr...
- 12:25 AM Bug #44314 (Pending Backport): osd-backfill-stats.sh failing intermittently in TEST_backfill_size...
06/09/2020
- 09:34 PM Backport #45780 (Resolved): nautilus: rados/test_envlibrados_for_rocksdb.sh build failure (seen i...
- 02:58 PM Backport #45780: nautilus: rados/test_envlibrados_for_rocksdb.sh build failure (seen in nautilus)
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/35387
merged
- 09:02 PM Bug #42716: Pool creation error message is hidden on FileStore-backed pools
- That wasn't the initial issue reported.
What happens if you run "ceph osd pool create foo2 2048" instead? (assumin...
- 07:38 PM Bug #42716 (Resolved): Pool creation error message is hidden on FileStore-backed pools
- closing this as already resolved....
- 02:41 PM Bug #36337: OSDs crash with failed assertion in PGLog::merge_log as logs do not overlap
- ...
- 02:41 PM Bug #45956 (New): verify takes forever to finish
- rados/verify/{centos_latest.yaml ceph.yaml clusters/{fixed-2.yaml openstack.yaml} d-thrash/default/{default.yaml thra...
- 12:24 PM Bug #45661 (Resolved): valgrind issue: UninitValue in ProtocolV2
- In @master@ the PR #35407 has been closed in favor of https://github.com/ceph/ceph/pull/35186.
#35407 still might be...
- 06:34 AM Bug #45948 (Duplicate): ceph_test_rados_delete_pools_parallel failed with error -2 on nautilus
- Oops, this is a dup of #43887
- 06:31 AM Bug #45948 (Duplicate): ceph_test_rados_delete_pools_parallel failed with error -2 on nautilus
- /a/yuriw-2020-06-08_16:06:08-rados-wip-yuri2-testing-2020-06-08-1458-nautilus-distro-basic-smithi/5129541...
- 06:06 AM Bug #45947: ceph_test_rados_watch_notify hang seen in nautilus
- Note https://tracker.ceph.com/issues/43861 removed this test from master because it was hanging.
- 06:02 AM Bug #45947: ceph_test_rados_watch_notify hang seen in nautilus
- This is very similar to what is seen in #45946 so they may be related.
- 06:01 AM Bug #45947 (New): ceph_test_rados_watch_notify hang seen in nautilus
- /a/yuriw-2020-06-08_16:06:08-rados-wip-yuri2-testing-2020-06-08-1458-nautilus-distro-basic-smithi/5129565...
- 05:32 AM Bug #45946 (New): ceph_test_rados_delete_pools_parallel hang seen in octopus
- /a/yuriw-2020-05-29_15:51:00-rados-wip-yuri-testing-2020-05-28-2238-octopus-distro-basic-smithi/5103106...
- 04:28 AM Bug #20960: ceph_test_rados: mismatched version (due to pg import/export)
- ...
- 12:05 AM Bug #44510: osd/osd-recovery-space.sh TEST_recovery_test_simple failure
- Seen again:
http://pulpito.ceph.com/dzafman-2020-06-08_11:45:40-rados-wip-zafman-testing-distro-basic-smithi/5130114
06/08/2020
- 11:51 PM Bug #43888: osd/osd-bench.sh 'tell osd.N bench' hang
- Saw this in at least 17 jobs:
http://pulpito.ceph.com/dzafman-2020-06-08_11:45:40-rados-wip-zafman-testing-distro-...
- 11:39 PM Bug #45944 (Triaged): osd/osd-markdown.sh: TEST_osd_stop failed
- This appears to be a rare condition in which a 15-second sleep was not enough.
- 09:14 PM Bug #45944 (Triaged): osd/osd-markdown.sh: TEST_osd_stop failed
- ...
- 09:10 PM Bug #45318: Health check failed: 2/6 mons down, quorum b,a,c,e (MON_DOWN)" in cluster log running...
- rados/multimon/{clusters/21 msgr-failures/few msgr/async-v1only no_pools objectstore/bluestore-comp-zlib rados suppor...
- 07:39 PM Bug #45943 (Fix Under Review): Ceph Monitor heartbeat grace period does not reset.
- 07:09 PM Bug #45943 (Resolved): Ceph Monitor heartbeat grace period does not reset.
- The heartbeat grace timer does not reset even after the cluster network has been stable for multiple days.
Implement a mechanism to...
- 06:31 PM Backport #45891 (In Progress): luminous: osd: pg stuck in waitactingchange when new acting set do...
- 06:22 PM Backport #45892 (In Progress): mimic: osd: pg stuck in waitactingchange when new acting set doesn...
- 12:51 PM Bug #45795 (Fix Under Review): PrimaryLogPG.cc: 627: FAILED ceph_assert(!get_acting_recovery_back...
- 07:01 AM Bug #45916: cls_lock: unlimited shared lock created by libradosstriper api let node crash
- add pr: https://github.com/ceph/ceph/pull/35467
- 06:50 AM Bug #45916 (Fix Under Review): cls_lock: unlimited shared lock created by libradosstriper api let...
- _Background: Ceph Luminous is running in our production environment, and a service uses the libradosstriper api to access ceph._
W...
06/06/2020
- 08:45 AM Backport #45357 (Resolved): octopus: rados: Sharded OpWQ drops suicide_grace after waiting for work
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/34881
m...
- 08:31 AM Backport #45884 (In Progress): octopus: osd-scrub-repair.sh: SyntaxError: invalid syntax
- 08:31 AM Backport #45882 (In Progress): octopus: Objecter: don't attempt to read from non-primary on EC pools
- 08:30 AM Backport #45779 (In Progress): octopus: rados/test_envlibrados_for_rocksdb.sh build failure (seen...
- 08:29 AM Backport #45775 (In Progress): octopus: build_incremental_map_msg missing incremental map while s...
- 08:28 AM Backport #45673 (In Progress): octopus: qa: powercycle: install task runs twice with double unwin...
- 12:53 AM Bug #44314 (In Progress): osd-backfill-stats.sh failing intermittently in TEST_backfill_sizeup_ou...
06/05/2020
- 10:52 PM Bug #44314: osd-backfill-stats.sh failing intermittently in TEST_backfill_sizeup_out() (degraded ...
It would be helpful to see the osd logs when this happens. We are expecting the following sequence to occur.
St...
- 04:20 PM Bug #45721: CommandFailedError: Command failed (workunit test rados/test_python.sh) FAIL: test_ra...
- /a/yuriw-2020-06-04_18:03:48-rados-wip-yuri2-testing-2020-06-03-2341-MASTER-distro-basic-smithi/5117777
- 04:17 PM Bug #45424: api_watch_notify_pp: [ FAILED ] LibRadosWatchNotifyECPP.WatchNotify watch_notify_cx...
- /a/yuriw-2020-06-04_18:03:48-rados-wip-yuri2-testing-2020-06-03-2341-MASTER-distro-basic-smithi/5117783
- 04:01 PM Bug #20960: ceph_test_rados: mismatched version (due to pg import/export)
- /a/yuriw-2020-06-04_18:03:48-rados-wip-yuri2-testing-2020-06-03-2341-MASTER-distro-basic-smithi/5118028
- 03:58 PM Bug #44517: osd/osd-backfill-space.sh TEST_backfill_multi_partial: pgs didn't go active+clean
- ...
06/04/2020
- 09:15 PM Bug #45868: rados_api_tests: LibRadosWatchNotify.AioWatchNotify2 fails
- Similar...
- 09:06 PM Bug #45661 (Fix Under Review): valgrind issue: UninitValue in ProtocolV2
- https://github.com/ceph/ceph/pull/35407
- 10:07 AM Bug #45661: valgrind issue: UninitValue in ProtocolV2
- Pin-pointed to a branch of @PrimaryLogPG::do_manifest_flush()@:...
- 08:36 AM Bug #45661: valgrind issue: UninitValue in ProtocolV2
- ...
- 06:08 PM Bug #45795: PrimaryLogPG.cc: 627: FAILED ceph_assert(!get_acting_recovery_backfill().empty())
- Ah, that makes sense. It should suffice to simply not call populate_obc_watchers on a replica.
- 05:42 PM Bug #45795: PrimaryLogPG.cc: 627: FAILED ceph_assert(!get_acting_recovery_backfill().empty())
- After more digging, this doesn't appear to be related to notifies being sent to replicas.
The issue seems to be wi...
- 12:48 PM Backport #45890 (In Progress): nautilus: osd: pg stuck in waitactingchange when new acting set do...
- 11:58 AM Backport #45890 (Resolved): nautilus: osd: pg stuck in waitactingchange when new acting set doesn...
- https://github.com/ceph/ceph/pull/35389
- 12:44 PM Backport #45883 (In Progress): nautilus: osd-scrub-repair.sh: SyntaxError: invalid syntax
- 11:55 AM Backport #45883 (Resolved): nautilus: osd-scrub-repair.sh: SyntaxError: invalid syntax
- https://github.com/ceph/ceph/pull/35388
- 12:44 PM Backport #45780 (In Progress): nautilus: rados/test_envlibrados_for_rocksdb.sh build failure (see...
- 12:43 PM Backport #45776 (In Progress): nautilus: build_incremental_map_msg missing incremental map while ...
- 11:59 AM Backport #45892 (Rejected): mimic: osd: pg stuck in waitactingchange when new acting set doesn't ...
- https://github.com/ceph/ceph/pull/35484
- 11:59 AM Backport #45891 (Rejected): luminous: osd: pg stuck in waitactingchange when new acting set doesn...
- https://github.com/ceph/ceph/pull/35485
- 11:55 AM Backport #45884 (Resolved): octopus: osd-scrub-repair.sh: SyntaxError: invalid syntax
- https://github.com/ceph/ceph/pull/35445
- 11:55 AM Backport #45882 (Resolved): octopus: Objecter: don't attempt to read from non-primary on EC pools
- https://github.com/ceph/ceph/pull/35444
- 07:16 AM Bug #45871 (New): Incorrect (0) number of slow requests in health check
- ceph version 14.2.9-899-gc02349c600 (c02349c60052aaa6c7bd0c2270c7f7be16fab632) nautilus (stable)
Our cluster shows...
- 12:24 AM Bug #40117 (Duplicate): PG stuck in WaitActingChange
- Fixed in https://tracker.ceph.com/issues/41190
- 12:21 AM Bug #41190 (Pending Backport): osd: pg stuck in waitactingchange when new acting set doesn't change
- 12:20 AM Bug #41236 (Resolved): cosbench failures in rados/perf
- 12:18 AM Bug #41550 (Resolved): os/bluestore: fadvise_flag leak in generate_transaction
- 12:17 AM Bug #41677 (Resolved): Cephmon:fix mon crash
- Fixed as a part of https://tracker.ceph.com/issues/41680.
- 12:14 AM Bug #41913 (Resolved): With auto scaler operating stopping an OSD can lead to COT crashing instea...
- 12:08 AM Bug #45356 (Resolved): nautilus: rados/upgrade/mimic-x-singleton failures due to mon_client_direc...
06/03/2020
- 09:06 PM Bug #45733 (Pending Backport): osd-scrub-repair.sh: SyntaxError: invalid syntax
- 06:12 PM Bug #45733: osd-scrub-repair.sh: SyntaxError: invalid syntax
- https://github.com/ceph/ceph/pull/35279 merged
- 08:50 PM Backport #45357: octopus: rados: Sharded OpWQ drops suicide_grace after waiting for work
- Dan Hill wrote:
> https://github.com/ceph/ceph/pull/34881
merged
- 08:34 PM Bug #45868 (Resolved): rados_api_tests: LibRadosWatchNotify.AioWatchNotify2 fails
- ...
- 08:30 PM Bug #45761: mon_thrasher: "Error ENXIO: mon unavailable" during sync_force command leads to "fail...
- /a/yuriw-2020-06-02_15:07:59-rados-wip-yuri7-testing-2020-06-01-2256-octopus-distro-basic-smithi/5113082 - octopus
- 04:44 AM Bug #45761: mon_thrasher: "Error ENXIO: mon unavailable" during sync_force command leads to "fail...
- Moving this since it appears to be a problem with the mon_thrasher (or the MONs or monclients)....
- 02:44 PM Bug #45793 (Pending Backport): Objecter: don't attempt to read from non-primary on EC pools
- 01:24 PM Backport #41533: mimic: Move bluefs alloc size initialization log message to log level 1
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/30219
m...
- 12:59 PM Bug #45857 (New): crimson/alien_store: alienstore cannot open_collections
- setup: setting debug level 20 for bluestore, filestore and osd and using seastar with seastar_default_allocator + Rel...
- 01:50 AM Bug #9984: lttng_probe_unregister hangs on shutdown
- /a/yuriw-2020-05-30_02:18:17-rados-wip-yuri-master_5.29.20-distro-basic-smithi/5104372
Possibly an instance of thi...
06/02/2020
- 07:14 PM Bug #45795: PrimaryLogPG.cc: 627: FAILED ceph_assert(!get_acting_recovery_backfill().empty())
- I see. Watch being a write and notify being a read has always tripped me up, but I guess I looked at it from the side e...
- 03:28 PM Bug #45795: PrimaryLogPG.cc: 627: FAILED ceph_assert(!get_acting_recovery_backfill().empty())
- Well, osd-side notifies are reads in that they don't result in mutation. I think lingerops in general probably shoul...
- 10:38 AM Bug #45795: PrimaryLogPG.cc: 627: FAILED ceph_assert(!get_acting_recovery_backfill().empty())
- Samuel Just wrote:
> Did that fire on the replica? At a guess, the problem is that notifies are being sent to repli...
- 02:07 AM Bug #45795: PrimaryLogPG.cc: 627: FAILED ceph_assert(!get_acting_recovery_backfill().empty())
- It probably isn't https://tracker.ceph.com/issues/15391.
- 02:05 AM Bug #45795: PrimaryLogPG.cc: 627: FAILED ceph_assert(!get_acting_recovery_backfill().empty())
- Did that fire on the replica? At a guess, the problem is that notifies are being sent to replicas, which would be wr...
- 07:08 PM Bug #45802 (Resolved): Health check failed: Reduced data availability: PG_AVAILABILITY
- 06:19 PM Bug #45802 (Fix Under Review): Health check failed: Reduced data availability: PG_AVAILABILITY
- 06:17 PM Bug #45802 (Triaged): Health check failed: Reduced data availability: PG_AVAILABILITY
- Same root cause as https://tracker.ceph.com/issues/45619.
http://pulpito.ceph.com/teuthology-2020-05-30_03:05:02...
- 07:16 AM Bug #45809 (New): When an OSD is marked out, the `MAX AVAIL` doesn't change.
- Environment: Luminous 12.2.12
I have a question about the pool's `MAX AVAIL` in `ceph df`.
When I out an OSD, th...
- 06:00 AM Bug #45761: mon_thrasher: "Error ENXIO: mon unavailable" during sync_force command leads to "fail...
- /a/yuriw-2020-05-30_02:18:17-rados-wip-yuri-master_5.29.20-distro-basic-smithi/5104057
- 05:13 AM Bug #45661: valgrind issue: UninitValue in ProtocolV2
- /a/yuriw-2020-05-30_02:18:17-rados-wip-yuri-master_5.29.20-distro-basic-smithi/5103952
/a/yuriw-2020-05-30_02:18:17-...
06/01/2020
- 03:21 PM Bug #45802 (Resolved): Health check failed: Reduced data availability: PG_AVAILABILITY
- multiple RGW tests are failing on different branches, with:...
- 12:13 AM Bug #45796 (New): Ceph mon's sporadically report slow ops
- We have recently upgraded our cluster to 14.2.9 from 10.2.6 and are in the process of a rolling rebuild of many of th...
05/31/2020
- 01:20 PM Bug #45795: PrimaryLogPG.cc: 627: FAILED ceph_assert(!get_acting_recovery_backfill().empty())
- Sam, could you please take a look?
- 01:19 PM Bug #45795 (Resolved): PrimaryLogPG.cc: 627: FAILED ceph_assert(!get_acting_recovery_backfill().e...
- I'm running into this assert while trying to exercise krbd with replica reads (particularly balanced reads):...
- 12:34 PM Bug #45793: Objecter: don't attempt to read from non-primary on EC pools
- Marking only for octopus, since replica reads are safe for general use only in octopus.
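(For context on the replica-read setting discussed above, a minimal sketch, assuming an Octopus client and kernel with the `read_from_replica` map option; the image name is hypothetical:)

```sh
# Map a krbd image with balanced reads, so reads may be served by any
# OSD in the acting set rather than only the primary (replicated pools).
rbd device map rbd/test-image -o read_from_replica=balance
```

The fix tracked here makes the Objecter fall back to primary-only reads on EC pools even when such a flag is set.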
- 12:32 PM Bug #45793 (Fix Under Review): Objecter: don't attempt to read from non-primary on EC pools
- 12:25 PM Bug #45793 (Resolved): Objecter: don't attempt to read from non-primary on EC pools
05/29/2020
- 05:31 PM Backport #45781 (Rejected): mimic: rados/test_envlibrados_for_rocksdb.sh build failure (seen in n...
- 05:31 PM Backport #45780 (Resolved): nautilus: rados/test_envlibrados_for_rocksdb.sh build failure (seen i...
- https://github.com/ceph/ceph/pull/35387
- 05:31 PM Backport #45779 (Resolved): octopus: rados/test_envlibrados_for_rocksdb.sh build failure (seen in...
- https://github.com/ceph/ceph/pull/35443
- 05:30 PM Backport #45776 (Resolved): nautilus: build_incremental_map_msg missing incremental map while sna...
- https://github.com/ceph/ceph/pull/35386
- 05:30 PM Backport #45775 (Resolved): octopus: build_incremental_map_msg missing incremental map while snap...
- https://github.com/ceph/ceph/pull/35442
- 05:16 AM Bug #45761 (Need More Info): mon_thrasher: "Error ENXIO: mon unavailable" during sync_force comma...
- /a/yuriw-2020-05-28_02:23:45-rados-wip-yuri-master_5.27.20-distro-basic-smithi/5097794...
- 04:11 AM Bug #45619 (Resolved): Health check failed: Reduced data availability: PG_AVAILABILITY
- 03:58 AM Bug #45760 (Resolved): osd-scrub-snaps.sh: TEST_scrub_snaps failed
05/28/2020
- 10:48 PM Bug #45760 (Fix Under Review): osd-scrub-snaps.sh: TEST_scrub_snaps failed
- 09:12 PM Bug #45760 (Resolved): osd-scrub-snaps.sh: TEST_scrub_snaps failed
- ...
- 09:39 PM Bug #45660 (Resolved): osd-scrub-repair.sh:TEST_corrupt_scrub_replicated failed
- 12:42 AM Bug #45660 (Fix Under Review): osd-scrub-repair.sh:TEST_corrupt_scrub_replicated failed
- 08:57 PM Bug #45619 (Fix Under Review): Health check failed: Reduced data availability: PG_AVAILABILITY
- 01:52 PM Bug #41399 (Resolved): Move bluefs alloc size initialization log message to log level 1
- 01:52 PM Backport #41533 (Resolved): mimic: Move bluefs alloc size initialization log message to log level 1
- 07:17 AM Bug #45606 (Pending Backport): build_incremental_map_msg missing incremental map while snaptrim o...
- 06:38 AM Bug #44595: cache tiering: Error: oid 48 copy_from 493 returned error code -2
- ...
- 06:08 AM Bug #45661: valgrind issue: UninitValue in ProtocolV2
- /a/kchai-2020-05-27_23:43:53-rados-wip-kefu-testing-2020-05-27-2242-distro-basic-smithi/5097299/remote/*/log/valgrin...
- 02:10 AM Bug #45661: valgrind issue: UninitValue in ProtocolV2
- /a/yuriw-2020-05-24_19:30:40-rados-wip-yuri-master_5.24.20-distro-basic-smithi/5088037
/a/yuriw-2020-05-24_19:30:40-...