Activity
From 06/03/2020 to 07/02/2020
07/02/2020
- 04:46 PM Bug #46285 (Rejected): osd: error from smartctl is always reported as invalid JSON
- turns out the report was from an earlier version (it did not contain the 'output' key)
- 04:37 PM Bug #46179 (Duplicate): Health check failed: Reduced data availability: PG_AVAILABILITY
- 04:36 PM Bug #46225 (Duplicate): Health check failed: 1 osds down (OSD_DOWN)
- 01:35 PM Bug #46264: mon: check for mismatched daemon versions
- I have completed a function called check_daemon_version, located in src/mon/Monitor.cc. This function goes through mon_...
- 09:48 AM Bug #44755 (Pending Backport): Create stronger affinity between drivegroup specs and osd daemons
- 09:04 AM Bug #46178 (Duplicate): slow request osd_op(... (undecoded) ondisk+retry+read+ignore_overlay+know...
- 08:56 AM Bug #46180 (Resolved): qa: Scrubbing terminated -- not all pgs were active and clean.
- Will be cherry-picked into https://github.com/ceph/ceph/pull/35720 and https://github.com/ceph/ceph/pull/35733.
07/01/2020
- 10:55 PM Bug #46325 (Rejected): A pool at size 3 should have a min_size 2
The get_osd_pool_default_min_size() calculation of size - size/2 for the min_size should special case size 3 and ju...
- 10:03 PM Bug #37509 (Can't reproduce): require past_interval bounds mismatch due to osd oldest_map
- 09:58 PM Bug #23879 (Can't reproduce): test_mon_osdmap_prune.sh fails
- 09:57 PM Bug #23857 (Can't reproduce): flush (manifest) vs async recovery causes out of order op
- 09:56 PM Bug #23828 (Can't reproduce): ec gen object leaks into different filestore collection just after ...
- 09:53 PM Bug #23117: PGs stuck in "activating" after osd_max_pg_per_osd_hard_ratio has been exceeded once
- We should try to make it more obvious when this limit is hit. I thought we added something in the cluster logs about ...
- 09:49 PM Documentation #46324 (New): Sepia VPN Client Access documentation is out-of-date
- https://wiki.sepia.ceph.com/doku.php?id=vpnaccess#vpn_client_access
There are two issues that I noticed that must ...
- 09:49 PM Bug #20960 (Can't reproduce): ceph_test_rados: mismatched version (due to pg import/export)
- The thrash_cache_writeback_proxy_none failure has a different root cause, opened a new tracker for it https://tracker...
- 09:47 PM Bug #46323 (Resolved): thrash_cache_writeback_proxy_none: FAILED ceph_assert(version == old_value...
- ...
- 09:35 PM Bug #19700 (Closed): OSD remained up despite cluster network being inactive?
- Please reopen this bug if the issue is seen in nautilus or newer releases.
- 09:22 PM Bug #43882 (Can't reproduce): osd to mon connection lost, osd stuck down
- 09:16 PM Bug #44631 (Can't reproduce): ceph pg dump error code 124
- 07:58 PM Bug #46275: Cancellation of on-going scrubs
- We may be able to easily terminate scrubbing in between chunks if the noscrub/nodeep-scrub flags get set.
I will test this.
- 07:56 PM Bug #46275 (In Progress): Cancellation of on-going scrubs
- 07:32 PM Backport #46095 (Resolved): octopus: Issue health status warning if num_shards_repaired exceeds s...
- 07:22 PM Backport #46115: octopus: Add statfs output to ceph-objectstore-tool
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/35715
merged
- 06:00 PM Bug #46318 (Need More Info): mon_recovery: quorum_status times out
- ...
- 05:05 PM Bug #46285: osd: error from smartctl is always reported as invalid JSON
- Which version is this cluster running?
I would expect to see this "output" key in the command's output:
https://g...
- 02:43 AM Bug #46285 (Rejected): osd: error from smartctl is always reported as invalid JSON
- When smartctl returns an error, the osd always reports it as invalid json. We meant to give a better error, but the c...
- 02:51 AM Backport #46287 (Rejected): nautilus: mon: log entry with garbage generated by bad memory access
- 02:51 AM Backport #46286 (Resolved): octopus: mon: log entry with garbage generated by bad memory access
- https://github.com/ceph/ceph/pull/36035
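
The min_size formula quoted in Bug #46325 above (size - size/2) can be checked quickly. This is a minimal sketch that assumes the division is integer division, as it would be for integer operands in C++; the function name default_min_size is hypothetical and is not Ceph's API.

```python
def default_min_size(size: int) -> int:
    """Evaluate size - size/2 with integer (floor) division.

    Hypothetical helper for illustration only; it mirrors the formula
    quoted in the report, not the actual Ceph implementation.
    """
    return size - size // 2


# Tabulate the result for a few pool sizes.
for size in range(1, 7):
    print(f"size={size} -> min_size={default_min_size(size)}")
```

Under that assumption, the formula already yields 2 for a pool of size 3.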
06/30/2020
- 09:27 PM Bug #46222 (Won't Fix): Cbt installation task for cosbench fails.
- The root cause of this issue is that we put an older version of cosbench in https://drop.ceph.com/qa/ after the recen...
- 01:07 PM Bug #46222: Cbt installation task for cosbench fails.
http://qa-proxy.ceph.com/teuthology/ideepika-2020-06-29_08:23:54-rados-wip-deepika-testing-2020-06-25-2058-distro-b...
- 05:37 PM Bug #46216 (Pending Backport): mon: log entry with garbage generated by bad memory access
- 04:41 PM Bug #46216 (Fix Under Review): mon: log entry with garbage generated by bad memory access
- 04:23 PM Documentation #46279 (New): various matters related to ceph mon and orch cephadm -- this is sever...
- <andyg5> Hi, I am trying to move the MONitors over to the public network, and I'm not sure how to do it. I have setu...
- 03:07 PM Bug #46224 (Resolved): Health check failed: 4 mgr modules have failed (MGR_MODULE_ERROR)
- 01:52 PM Bug #46275 (Resolved): Cancellation of on-going scrubs
- Although it's possible to prevent initiating new scrubs, we don't have a facility for terminating already on-going on...
- 08:30 AM Bug #46264: mon: check for mismatched daemon versions
- Hm, what do you expect? Upgrade scenarios can become complicated with more than two versions running at the same time ...
06/29/2020
- 09:22 PM Bug #46266 (Need More Info): Monitor crashed in creating pool in CrushTester::test_with_fork()
- Hi. I was creating a new pool and one of my monitors crashed....
- 06:44 PM Bug #43553: mon: client mon_status fails
- /ceph/teuthology-archive/yuriw-2020-06-25_22:31:00-fs-octopus-distro-basic-smithi/5180260/teuthology.log
- 06:10 PM Bug #46264 (Resolved): mon: check for mismatched daemon versions
- There is currently no test to check whether the daemons are all running the same version of Ceph.
- 05:44 PM Bug #20960: ceph_test_rados: mismatched version (due to pg import/export)
- /a/dis-2020-06-28_18:43:20-rados-wip-msgr21-fix-reuse-rebuildci-distro-basic-smithi/5186890
- 05:36 PM Bug #45761: mon_thrasher: "Error ENXIO: mon unavailable" during sync_force command leads to "fail...
- /a/dis-2020-06-28_18:43:20-rados-wip-msgr21-fix-reuse-rebuildci-distro-basic-smithi/5186759
- 05:02 PM Backport #46262 (Resolved): nautilus: larger osd_scrub_max_preemptions values cause Floating poin...
- https://github.com/ceph/ceph/pull/37470
- 05:01 PM Backport #46261 (Resolved): octopus: larger osd_scrub_max_preemptions values cause Floating point...
- https://github.com/ceph/ceph/pull/36034
- 12:26 PM Bug #46178: slow request osd_op(... (undecoded) ondisk+retry+read+ignore_overlay+known_if_redirec...
- https://pulpito.ceph.com/swagner-2020-06-29_09:26:42-rados:cephadm-wip-swagner-testing-2020-06-26-1524-distro-basic-s...
- 08:53 AM Bug #44352: pool listings are slow after deleting objects
- This was on the latest nautilus release at the time, the DB should have been on SSD but I don't remember. But good po...
- 08:50 AM Bug #45381: unfound objects in erasure-coded CephFS
- No, this setup is luckily without any cache tiering. It's a completely standard setup with replicated cephfs_metadata...
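
For Bug #46264 above (checking that all daemons run the same version), the idea can be sketched roughly as follows. This is not the check_daemon_version function mentioned in the tracker; the function name find_version_mismatches, the input mapping, and the daemon names and version strings are all hypothetical and made up for illustration.

```python
from collections import defaultdict


def find_version_mismatches(daemon_versions):
    """Group daemons by reported version (hypothetical helper).

    daemon_versions maps a daemon name to its version string.
    Returns an empty dict when every daemon reports the same version,
    otherwise a dict of version -> sorted list of daemon names.
    """
    by_version = defaultdict(list)
    for daemon, version in daemon_versions.items():
        by_version[version].append(daemon)
    if len(by_version) <= 1:
        return {}
    return {version: sorted(names) for version, names in by_version.items()}


# Made-up sample data: one OSD lags behind the other daemons.
sample = {"mon.a": "15.2.3", "osd.0": "15.2.3", "osd.1": "14.2.9"}
print(find_version_mismatches(sample))
```

Returning the full grouping rather than a boolean keeps enough detail to say which daemons are on which version, which matters for the multi-version upgrade scenarios raised in the discussion.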
06/28/2020
- 10:45 AM Bug #46180 (Fix Under Review): qa: Scrubbing terminated -- not all pgs were active and clean.
- 05:17 AM Bug #46024 (Pending Backport): larger osd_scrub_max_preemptions values cause Floating point excep...
06/27/2020
- 04:20 PM Bug #46242 (New): rados -p default.rgw.buckets.data returning over millions objects No such file ...
- Hi Dev,
Due to sharding / S3 bugs, we synced the customer's buckets to new ones.
Once we tried to delete we're unab...
- 03:15 PM Bug #44595: cache tiering: Error: oid 48 copy_from 493 returned error code -2
- /a/kchai-2020-06-27_07:37:00-rados-wip-kefu-testing-2020-06-27-1407-distro-basic-smithi/5183671/
- 08:25 AM Bug #45721: CommandFailedError: Command failed (workunit test rados/test_python.sh) FAIL: test_ra...
- ...
06/26/2020
- 07:27 PM Bug #46178: slow request osd_op(... (undecoded) ondisk+retry+read+ignore_overlay+known_if_redirec...
- http://pulpito.ceph.com/mgfritch-2020-06-26_02:07:27-rados-wip-mgfritch-testing-2020-06-25-1855-distro-basic-smithi/...
- 06:21 PM Bug #46180: qa: Scrubbing terminated -- not all pgs were active and clean.
- Here's a reliable reproducer for the issue:
-s rados/singleton-nomsgr -c master --filter 'all/health-warnings rado...
- 06:50 AM Bug #46180: qa: Scrubbing terminated -- not all pgs were active and clean.
- I think it has to do with reconnect handling and how connections are reused.
This part of ProtocolV2 is pretty fra...
- 05:04 AM Bug #46180: qa: Scrubbing terminated -- not all pgs were active and clean.
- This is a msgr2.1 issue.
- 05:48 PM Bug #46225 (Triaged): Health check failed: 1 osds down (OSD_DOWN)
- 05:39 PM Bug #46225: Health check failed: 1 osds down (OSD_DOWN)
- Also, related to https://tracker.ceph.com/issues/46180...
- 10:57 AM Bug #46225 (Duplicate): Health check failed: 1 osds down (OSD_DOWN)
- /a/sseshasa-2020-06-24_17:46:09-rados-wip-sseshasa-testing-2020-06-24-1858-distro-basic-smithi/5176410
2020-06-2...
- 05:34 PM Bug #46227 (Duplicate): Segmentation fault when running ceph_test_keyvaluedb command as part of a...
- Duplicate of https://tracker.ceph.com/issues/46054
- 11:19 AM Bug #46227 (Duplicate): Segmentation fault when running ceph_test_keyvaluedb command as part of a...
- /a/sseshasa-2020-06-24_17:46:09-rados-wip-sseshasa-testing-2020-06-24-1858-distro-basic-smithi/5176446
Unfortunate...
- 05:31 PM Bug #46179 (Triaged): Health check failed: Reduced data availability: PG_AVAILABILITY
- 05:11 PM Bug #46179: Health check failed: Reduced data availability: PG_AVAILABILITY
- This failure is different from the one seen in the RGW suite earlier due to upmap. This is related to https://tracker...
- 07:32 AM Bug #46179: Health check failed: Reduced data availability: PG_AVAILABILITY
- /a/sseshasa-2020-06-24_17:46:09-rados-wip-sseshasa-testing-2020-06-24-1858-distro-basic-smithi/
job ID: 5176200
F...
- 05:31 PM Bug #46224 (Fix Under Review): Health check failed: 4 mgr modules have failed (MGR_MODULE_ERROR)
- 10:44 AM Bug #46224 (Resolved): Health check failed: 4 mgr modules have failed (MGR_MODULE_ERROR)
- /a/sseshasa-2020-06-24_17:46:09-rados-wip-sseshasa-testing-2020-06-24-1858-distro-basic-smithi/5176341 and
/a/ssesha...
- 05:30 PM Bug #46222 (In Progress): Cbt installation task for cosbench fails.
- 09:03 AM Bug #46222: Cbt installation task for cosbench fails.
- See /a/sseshasa-2020-06-24_17:46:09-rados-wip-sseshasa-testing-2020-06-24-1858-distro-basic-smithi/5176322 as well
- 09:00 AM Bug #46222 (Won't Fix): Cbt installation task for cosbench fails.
- /a/sseshasa-2020-06-24_17:46:09-rados-wip-sseshasa-testing-2020-06-24-1858-distro-basic-smithi/5176309
2020-06-2...
- 04:48 PM Feature #46238 (New): raise a HEALTH warn, if OSDs use the cluster_network for the front
- Related to: https://tracker.ceph.com/issues/46230
- 12:17 PM Backport #46229 (In Progress): octopus: Ceph Monitor heartbeat grace period does not reset.
- 12:14 PM Backport #46229 (New): octopus: Ceph Monitor heartbeat grace period does not reset.
- 11:48 AM Backport #46229 (Resolved): octopus: Ceph Monitor heartbeat grace period does not reset.
- https://github.com/ceph/ceph/pull/35799
- 12:13 PM Backport #46228 (In Progress): nautilus: Ceph Monitor heartbeat grace period does not reset.
- 12:13 PM Backport #46228 (New): nautilus: Ceph Monitor heartbeat grace period does not reset.
- 11:47 AM Backport #46228 (Resolved): nautilus: Ceph Monitor heartbeat grace period does not reset.
- https://github.com/ceph/ceph/pull/35798
- 11:43 AM Bug #45943 (Pending Backport): Ceph Monitor heartbeat grace period does not reset.
- 11:14 AM Documentation #46203 (Resolved): docs.ceph.com is down
- docs.ceph.com returned four hours later.
- 08:40 AM Bug #24057: cbt fails to copy results to the archive dir
- Observed the issue during this run:
/a/sseshasa-2020-06-24_17:46:09-rados-wip-sseshasa-testing-2020-06-24-1858-distr...
- 07:28 AM Bug #44595: cache tiering: Error: oid 48 copy_from 493 returned error code -2
- /a/sseshasa-2020-06-24_17:46:09-rados-wip-sseshasa-testing-2020-06-24-1858-distro-basic-smithi/
job ID: 5176184
...
- 07:18 AM Bug #45441: rados: Health check failed: 1/3 mons down, quorum a,c (MON_DOWN)" in cluster log'
- Observing the issue during this run:
/a/sseshasa-2020-06-24_17:46:09-rados-wip-sseshasa-testing-2020-06-24-1858-dist...
- 04:38 AM Bug #46125: ceph mon memory increasing
- I will try with default settings for the monitor. With current config file parameters, the monitor is using 1GB.
I...
06/25/2020
- 11:56 PM Bug #46216 (Resolved): mon: log entry with garbage generated by bad memory access
- Causes the mgr to segmentation fault:...
- 10:27 PM Bug #46178: slow request osd_op(... (undecoded) ondisk+retry+read+ignore_overlay+known_if_redirec...
- /a/yuvalif-2020-06-23_14:40:15-rgw-wip-yuval-test-35331-35155-distro-basic-smithi/5173465
Seems very likely to hav...
- 09:09 PM Bug #46125 (Need More Info): ceph mon memory increasing
- Can you try with the default settings for the monitor? What level of memory usage are you seeing exactly?
There is...
- 07:40 PM Bug #46180: qa: Scrubbing terminated -- not all pgs were active and clean.
- The common thing in all of these is that the tests are all failing while running the ceph task, no thrashing or anyth...
- 03:15 PM Bug #46180: qa: Scrubbing terminated -- not all pgs were active and clean.
- Saw the same error during this run:
http://pulpito.ceph.com/sseshasa-2020-06-24_17:46:09-rados-wip-sseshasa-testing-...
- 05:29 PM Bug #46211 (Duplicate): qa: pools stuck in creating
- 05:26 PM Bug #46211 (Duplicate): qa: pools stuck in creating
- During cluster setup for the CephFS suites, we see this failure:...
- 03:44 PM Bug #39039: mon connection reset, command not resent
- Hitting this issue on octopus, Fedora 32:...
- 02:18 PM Documentation #46203 (In Progress): docs.ceph.com is down
- I'm afraid this is outside my control. We're at the mercy of our cloud provider. Pretty sure it's this: http://trav...
- 07:49 AM Documentation #46203 (Resolved): docs.ceph.com is down
- docs.ceph.com has been down since 17:35 AEST on 25 Jun 2020 at the latest.
https://downforeveryoneorjustme.com/docs.ce...
06/24/2020
- 02:16 PM Bug #46180 (Resolved): qa: Scrubbing terminated -- not all pgs were active and clean.
- Seeing several test failures in the rgw suite:...
- 02:09 PM Bug #46179 (Duplicate): Health check failed: Reduced data availability: PG_AVAILABILITY
- multiple RGW tests are failing on different branches, with:...
- 01:20 PM Bug #46178: slow request osd_op(... (undecoded) ondisk+retry+read+ignore_overlay+known_if_redirec...
- http://pulpito.ceph.com/swagner-2020-06-24_11:30:44-rados:cephadm-wip-swagner3-testing-2020-06-24-1025-distro-basic-s...
- 01:19 PM Bug #46178: slow request osd_op(... (undecoded) ondisk+retry+read+ignore_overlay+known_if_redirec...
- http://pulpito.ceph.com/swagner-2020-06-24_11:30:44-rados:cephadm-wip-swagner3-testing-2020-06-24-1025-distro-basic-s...
- 01:16 PM Bug #46178: slow request osd_op(... (undecoded) ondisk+retry+read+ignore_overlay+known_if_redirec...
- http://pulpito.ceph.com/swagner-2020-06-24_11:30:44-rados:cephadm-wip-swagner3-testing-2020-06-24-1025-distro-basic-s...
- 12:57 PM Bug #46178 (Duplicate): slow request osd_op(... (undecoded) ondisk+retry+read+ignore_overlay+know...
- Saw this error yesterday for the first time:
http://pulpito.ceph.com/swagner-2020-06-23_13:15:09-rados:cephadm-wip...
- 10:37 AM Backport #45676 (Resolved): octopus: rados/test_envlibrados_for_rocksdb.sh fails on Xenial (seen ...
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/35236
m...
- 02:47 AM Bug #45721: CommandFailedError: Command failed (workunit test rados/test_python.sh) FAIL: test_ra...
- ...
- 01:36 AM Backport #46164 (In Progress): nautilus: osd: make message cap option usable again
- 01:13 AM Backport #46164 (Resolved): nautilus: osd: make message cap option usable again
- https://github.com/ceph/ceph/pull/35738
- 01:28 AM Backport #46165 (In Progress): octopus: osd: make message cap option usable again
- 01:13 AM Backport #46165 (Resolved): octopus: osd: make message cap option usable again
- https://github.com/ceph/ceph/pull/35737
- 12:18 AM Bug #46143 (Pending Backport): osd: make message cap option usable again
06/23/2020
- 08:11 PM Backport #45676: octopus: rados/test_envlibrados_for_rocksdb.sh fails on Xenial (seen in nautilus)
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/35236
merged
- 12:15 AM Bug #45944: osd/osd-markdown.sh: TEST_osd_stop failed
- ...
06/22/2020
- 09:59 PM Backport #46115 (In Progress): octopus: Add statfs output to ceph-objectstore-tool
- 09:37 PM Backport #46116 (In Progress): nautilus: Add statfs output to ceph-objectstore-tool
- 06:52 PM Bug #45944: osd/osd-markdown.sh: TEST_osd_stop failed
- /a/teuthology-2020-06-19_07:01:02-rados-master-distro-basic-smithi/5164221
- 05:53 PM Bug #46143 (Fix Under Review): osd: make message cap option usable again
- 05:36 PM Bug #46143 (In Progress): osd: make message cap option usable again
- 05:18 PM Bug #46143 (Resolved): osd: make message cap option usable again
- "This reverts commit 45d5ac3.
Without a msg throttler, we can't change osd_client_message_cap cap
online. The thr...
- 04:57 PM Bug #41154: osd: pg unknown state
- I again have this problem....
- 03:19 PM Documentation #46141 (New): Document automatic OSD deployment behavior better
- Make certain that the documentation notifies readers that OSDs are automatically created, so that they are not caught...
- 09:12 AM Bug #46137: Monitor leader is marking multiple osd's down
- Every few mins multiple osd's are going down and coming back up, which is causing recovery of data. This is occurring ...
- 09:07 AM Bug #46137 (New): Monitor leader is marking multiple osd's down
- My ceph cluster consists of 5 Mon and 58 DN with 1302 total osd's (HDD's) with 12.2.8 Luminous (stable) version and Fi...
- 06:02 AM Bug #45943: Ceph Monitor heartbeat grace period does not reset.
- Updates from testing the fix:
OSD failure before being marked down:...
06/21/2020
- 02:17 PM Feature #24099: osd: Improve workflow when creating OSD on raw block device if there was bluestor...
- John Spray wrote:
> This seems like an odd idea -- if someone is doing OSD creation by hand, why would they want to ...
- 12:25 PM Documentation #46099: document statfs operation for ceph-objectstore-tool
if (op == "statfs") {
  store_statfs_t statsbuf;
  ret = fs->statfs(&statsbuf);
  if (ret < 0) {
...
- 12:10 PM Documentation #46126 (New): RGW docs lack an explanation of how permissions management works, esp...
- <dirtwash> you know its sshitty protocol and design if obvious things arent visible and default behavior doesnt work
...
- 08:02 AM Bug #46125: ceph mon memory increasing
- Hi,
I have deployed ceph single node cluster.
ceph version 14.2.9 (581f22da52345dba46ee232b73b990f06029a2a0) na...
- 07:13 AM Bug #46125 (Need More Info): ceph mon memory increasing
- Hi,
I have deployed ceph single node cluster.
ceph version 14.2.9 (581f22da52345dba46ee232b73b990f06029a2a0) ...
06/20/2020
- 10:12 PM Backport #46096 (In Progress): nautilus: Issue health status warning if num_shards_repaired excee...
- 10:09 PM Backport #46095 (In Progress): octopus: Issue health status warning if num_shards_repaired exceed...
- 09:57 PM Bug #45793 (Resolved): Objecter: don't attempt to read from non-primary on EC pools
- While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ...
- 09:56 PM Backport #45882 (Resolved): octopus: Objecter: don't attempt to read from non-primary on EC pools
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/35444
m...
- 09:56 PM Backport #45775 (Resolved): octopus: build_incremental_map_msg missing incremental map while snap...
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/35442
m...
- 07:59 AM Documentation #46120 (Resolved): Improve ceph-objectstore-tool documentation
- https://github.com/ceph/ceph/pull/33823
There are a number of comments by David Zafman that I failed to include in...
- 04:20 AM Bug #46065 (Resolved): sudo missing from command in monitor-bootstrapping procedure
06/19/2020
- 04:36 PM Backport #46116 (Resolved): nautilus: Add statfs output to ceph-objectstore-tool
- https://github.com/ceph/ceph/pull/35713
- 04:36 PM Backport #46115 (Resolved): octopus: Add statfs output to ceph-objectstore-tool
- https://github.com/ceph/ceph/pull/35715
- 05:00 AM Documentation #46099 (New): document statfs operation for ceph-objectstore-tool
- https://github.com/ceph/ceph/pull/35632
https://github.com/ceph/ceph/pull/33823
The affected file (I think) is ...
06/18/2020
- 11:26 PM Bug #46064 (Pending Backport): Add statfs output to ceph-objectstore-tool
- 01:13 AM Bug #46064 (Fix Under Review): Add statfs output to ceph-objectstore-tool
- 01:08 AM Bug #46064 (In Progress): Add statfs output to ceph-objectstore-tool
- 01:07 AM Bug #46064 (Resolved): Add statfs output to ceph-objectstore-tool
This will help diagnose out of space crashes:...
- 10:32 PM Backport #45882: octopus: Objecter: don't attempt to read from non-primary on EC pools
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/35444
merged
- 10:31 PM Backport #45775: octopus: build_incremental_map_msg missing incremental map while snaptrim or bac...
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/35442
merged
- 08:08 PM Backport #46096 (Resolved): nautilus: Issue health status warning if num_shards_repaired exceeds ...
- https://github.com/ceph/ceph/pull/36379
- 08:08 PM Backport #46095 (Resolved): octopus: Issue health status warning if num_shards_repaired exceeds s...
- https://github.com/ceph/ceph/pull/35685
- 08:06 PM Backport #46090 (Resolved): nautilus: PG merge: FAILED ceph_assert(info.history.same_interval_sin...
- https://github.com/ceph/ceph/pull/36161
- 08:06 PM Backport #46089 (Resolved): octopus: PG merge: FAILED ceph_assert(info.history.same_interval_sinc...
- https://github.com/ceph/ceph/pull/36033
- 08:06 PM Backport #46086 (Resolved): octopus: osd: wakeup all threads of shard rather than one thread
- https://github.com/ceph/ceph/pull/36032
- 10:40 AM Bug #46071 (New): potential rocksdb failure: few osd's service not starting up after node reboot....
- Data node went down abruptly due to issue with SPS-BD Smart Array PCIe SAS Expander, once hardware was changed node c...
- 03:30 AM Bug #46065 (Fix Under Review): sudo missing from command in monitor-bootstrapping procedure
- https://github.com/ceph/ceph/pull/35635
- 03:25 AM Bug #46065 (Resolved): sudo missing from command in monitor-bootstrapping procedure
- Where:
https://docs.ceph.com/docs/master/install/manual-deployment/#monitor-bootstrapping
What:
<badone> https:/...
06/17/2020
- 09:21 PM Bug #45991 (Pending Backport): PG merge: FAILED ceph_assert(info.history.same_interval_since != 0)
- 11:42 AM Bug #45991 (Fix Under Review): PG merge: FAILED ceph_assert(info.history.same_interval_since != 0)
- 09:19 PM Bug #46024 (Fix Under Review): larger osd_scrub_max_preemptions values cause Floating point excep...
- 09:19 PM Bug #46043 (Need More Info): osd/ECBackend.cc: 1551: FAILED assert(!(*m).is_missing(hoid))
- It is really hard to say what caused this assert without enough debug logging, and I doubt we will be able to reproduce t...
- 07:41 AM Bug #46043 (Need More Info): osd/ECBackend.cc: 1551: FAILED assert(!(*m).is_missing(hoid))
- We observed this crash on one of the customer servers:...
- 05:22 PM Feature #41564 (Pending Backport): Issue health status warning if num_shards_repaired exceeds som...
- 03:35 PM Bug #46053 (Resolved): osd: wakeup all threads of shard rather than one thread
06/16/2020
- 01:52 AM Bug #46024 (Resolved): larger osd_scrub_max_preemptions values cause Floating point exception
A non-default large osd_scrub_max_preemptions value (e.g., 32) would cause scrubber.preempt_divisor underflow and...
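
One way a repeatedly adjusted divisor can end in a floating point exception is sketched below. The sketch rests on assumptions the truncated report does not confirm: that the divisor is a 32-bit integer starting at 1 and doubled once per preemption. The helper name divisor_after is hypothetical, and the bit mask stands in for fixed-width integer arithmetic, which Python does not have natively.

```python
MASK32 = 0xFFFFFFFF  # emulate a 32-bit integer register


def divisor_after(preemptions: int) -> int:
    """Return the divisor after doubling it once per preemption.

    Hypothetical model: starts at 1 and wraps at 32 bits.
    """
    divisor = 1
    for _ in range(preemptions):
        divisor = (divisor * 2) & MASK32
    return divisor


for n in (5, 31, 32):
    print(f"{n} preemptions -> divisor {divisor_after(n)}")

# With 32 doublings the modelled divisor wraps to zero, so any division by it fails.
try:
    25 // divisor_after(32)
except ZeroDivisionError:
    print("division by zero")
```

In C or C++ on Linux, an integer division by zero typically raises SIGFPE, which the shell reports as "Floating point exception" even though no floating-point arithmetic is involved.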
06/15/2020
- 07:25 PM Backport #46018 (Resolved): octopus: ceph_test_rados_watch_notify hang
- 07:25 PM Backport #46017 (Resolved): nautilus: ceph_test_rados_watch_notify hang
- https://github.com/ceph/ceph/pull/36031
- 07:24 PM Backport #46016 (Resolved): octopus: osd-backfill-stats.sh failing intermittently in TEST_backfil...
- https://github.com/ceph/ceph/pull/36030
- 07:22 PM Bug #45612 (Resolved): qa: powercycle: install task runs twice with double unwind causing fatal e...
- While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ...
- 07:21 PM Backport #46007 (Resolved): octopus: PrimaryLogPG.cc: 627: FAILED ceph_assert(!get_acting_recover...
- https://github.com/ceph/ceph/pull/36029
06/13/2020
- 05:26 AM Bug #45991 (Resolved): PG merge: FAILED ceph_assert(info.history.same_interval_since != 0)
- http://qa-proxy.ceph.com/teuthology/xxg-2020-06-13_00:34:59-rados:thrash-wip-nautilus-nnnn-distro-basic-smithi/514318...
06/12/2020
- 02:50 PM Backport #45884: octopus: osd-scrub-repair.sh: SyntaxError: invalid syntax
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/35445
m...
- 03:50 AM Backport #45884 (Resolved): octopus: osd-scrub-repair.sh: SyntaxError: invalid syntax
- 12:31 AM Backport #45884: octopus: osd-scrub-repair.sh: SyntaxError: invalid syntax
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/35445
merged
- 02:50 PM Backport #45779: octopus: rados/test_envlibrados_for_rocksdb.sh build failure (seen in nautilus)
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/35443
m...
- 03:47 AM Backport #45779 (Resolved): octopus: rados/test_envlibrados_for_rocksdb.sh build failure (seen in...
- 12:30 AM Backport #45779: octopus: rados/test_envlibrados_for_rocksdb.sh build failure (seen in nautilus)
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/35443
merged
- 02:49 PM Backport #45673 (Resolved): octopus: qa: powercycle: install task runs twice with double unwind c...
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/35441
m...
- 12:30 AM Backport #45673: octopus: qa: powercycle: install task runs twice with double unwind causing fata...
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/35441
merged
- 09:33 AM Documentation #45988: [doc/os]: Centos 8 is not listed even though it is supported
- I confirm that there is a row in this table that mentions Centos 8, and that this line appears when I build the docs ...
- 09:24 AM Documentation #45988 (Resolved): [doc/os]: Centos 8 is not listed even though it is supported
- 19
https://docs.ceph.com/docs/master/releases/octopus/
https://docs.ceph.com/docs/octopus/start/os-recommendations/...
06/11/2020
- 05:23 PM Backport #45780: nautilus: rados/test_envlibrados_for_rocksdb.sh build failure (seen in nautilus)
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/35387
m...
- 01:26 AM Bug #45795 (Pending Backport): PrimaryLogPG.cc: 627: FAILED ceph_assert(!get_acting_recovery_back...
- 01:21 AM Bug #38357: ClsLock.TestExclusiveEphemeralStealEphemeral failed
- ...
06/10/2020
- 09:30 PM Bug #45916 (Fix Under Review): cls_lock: unlimited shared lock created by libradosstriper api let...
- 09:25 PM Bug #43861 (Pending Backport): ceph_test_rados_watch_notify hang
- Let's remove these tests from the stable branches too.
- 09:02 AM Feature #41564 (In Progress): Issue health status warning if num_shards_repaired exceeds some thr...
- 12:25 AM Bug #44314 (Pending Backport): osd-backfill-stats.sh failing intermittently in TEST_backfill_size...
06/09/2020
- 09:34 PM Backport #45780 (Resolved): nautilus: rados/test_envlibrados_for_rocksdb.sh build failure (seen i...
- 02:58 PM Backport #45780: nautilus: rados/test_envlibrados_for_rocksdb.sh build failure (seen in nautilus)
- Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/35387
merged
- 09:02 PM Bug #42716: Pool creation error message is hidden on FileStore-backed pools
- That wasn't the initial issue reported.
What happens if you run "ceph osd pool create foo2 2048" instead? (assumin...
- 07:38 PM Bug #42716 (Resolved): Pool creation error message is hidden on FileStore-backed pools
- closing this as already resolved....
- 02:41 PM Bug #36337: OSDs crash with failed assertion in PGLog::merge_log as logs do not overlap
- ...
- 02:41 PM Bug #45956 (New): verify takes forever to finish
- rados/verify/{centos_latest.yaml ceph.yaml clusters/{fixed-2.yaml openstack.yaml} d-thrash/default/{default.yaml thra...
- 12:24 PM Bug #45661 (Resolved): valgrind issue: UninitValue in ProtocolV2
- In @master@ the PR #35407 has been closed in favor of https://github.com/ceph/ceph/pull/35186.
#35407 still might be...
- 06:34 AM Bug #45948 (Duplicate): ceph_test_rados_delete_pools_parallel failed with error -2 on nautilus
- Oops, this is a dup of #43887
- 06:31 AM Bug #45948 (Duplicate): ceph_test_rados_delete_pools_parallel failed with error -2 on nautilus
- /a/yuriw-2020-06-08_16:06:08-rados-wip-yuri2-testing-2020-06-08-1458-nautilus-distro-basic-smithi/5129541...
- 06:06 AM Bug #45947: ceph_test_rados_watch_notify hang seen in nautilus
- Note https://tracker.ceph.com/issues/43861 removed this test from master because it was hanging.
- 06:02 AM Bug #45947: ceph_test_rados_watch_notify hang seen in nautilus
- This is very similar to what is seen in #45946 so they may be related.
- 06:01 AM Bug #45947 (New): ceph_test_rados_watch_notify hang seen in nautilus
- /a/yuriw-2020-06-08_16:06:08-rados-wip-yuri2-testing-2020-06-08-1458-nautilus-distro-basic-smithi/5129565...
- 05:32 AM Bug #45946 (New): ceph_test_rados_delete_pools_parallel hang seen in octopus
- /a/yuriw-2020-05-29_15:51:00-rados-wip-yuri-testing-2020-05-28-2238-octopus-distro-basic-smithi/5103106...
- 04:28 AM Bug #20960: ceph_test_rados: mismatched version (due to pg import/export)
- ...
- 12:05 AM Bug #44510: osd/osd-recovery-space.sh TEST_recovery_test_simple failure
- Seen again:
http://pulpito.ceph.com/dzafman-2020-06-08_11:45:40-rados-wip-zafman-testing-distro-basic-smithi/5130114
06/08/2020
- 11:51 PM Bug #43888: osd/osd-bench.sh 'tell osd.N bench' hang
- Saw this in at least 17 jobs:
http://pulpito.ceph.com/dzafman-2020-06-08_11:45:40-rados-wip-zafman-testing-distro-...
- 11:39 PM Bug #45944 (Triaged): osd/osd-markdown.sh: TEST_osd_stop failed
- This appears to be a rare condition when 15 seconds sleep was not enough.
- 09:14 PM Bug #45944 (Triaged): osd/osd-markdown.sh: TEST_osd_stop failed
- ...
- 09:10 PM Bug #45318: Health check failed: 2/6 mons down, quorum b,a,c,e (MON_DOWN)" in cluster log running...
- rados/multimon/{clusters/21 msgr-failures/few msgr/async-v1only no_pools objectstore/bluestore-comp-zlib rados suppor...
- 07:39 PM Bug #45943 (Fix Under Review): Ceph Monitor heartbeat grace period does not reset.
- 07:09 PM Bug #45943 (Resolved): Ceph Monitor heartbeat grace period does not reset.
- The heartbeat grace timer does not reset after cluster network is stable for multiple days.
Implement a mechanism to...
- 06:31 PM Backport #45891 (In Progress): luminous: osd: pg stuck in waitactingchange when new acting set do...
- 06:22 PM Backport #45892 (In Progress): mimic: osd: pg stuck in waitactingchange when new acting set doesn...
- 12:51 PM Bug #45795 (Fix Under Review): PrimaryLogPG.cc: 627: FAILED ceph_assert(!get_acting_recovery_back...
- 07:01 AM Bug #45916: cls_lock: unlimited shared lock created by libradosstriper api let node crash
- add pr: https://github.com/ceph/ceph/pull/35467
- 06:50 AM Bug #45916 (Fix Under Review): cls_lock: unlimited shared lock created by libradosstriper api let...
- _Background: Ceph Luminous is running in our production environment, and a service uses the libradosstriper API to access Ceph._
W...
06/06/2020
- 08:45 AM Backport #45357 (Resolved): octopus: rados: Sharded OpWQ drops suicide_grace after waiting for work
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/34881
m...
- 08:31 AM Backport #45884 (In Progress): octopus: osd-scrub-repair.sh: SyntaxError: invalid syntax
- 08:31 AM Backport #45882 (In Progress): octopus: Objecter: don't attempt to read from non-primary on EC pools
- 08:30 AM Backport #45779 (In Progress): octopus: rados/test_envlibrados_for_rocksdb.sh build failure (seen...
- 08:29 AM Backport #45775 (In Progress): octopus: build_incremental_map_msg missing incremental map while s...
- 08:28 AM Backport #45673 (In Progress): octopus: qa: powercycle: install task runs twice with double unwin...
- 12:53 AM Bug #44314 (In Progress): osd-backfill-stats.sh failing intermittently in TEST_backfill_sizeup_ou...
06/05/2020
- 10:52 PM Bug #44314: osd-backfill-stats.sh failing intermittently in TEST_backfill_sizeup_out() (degraded ...
It would be helpful to see the osd logs when this happens. We are expecting the following sequence to occur.
St...
- 04:20 PM Bug #45721: CommandFailedError: Command failed (workunit test rados/test_python.sh) FAIL: test_ra...
- /a/yuriw-2020-06-04_18:03:48-rados-wip-yuri2-testing-2020-06-03-2341-MASTER-distro-basic-smithi/5117777
- 04:17 PM Bug #45424: api_watch_notify_pp: [ FAILED ] LibRadosWatchNotifyECPP.WatchNotify watch_notify_cx...
- /a/yuriw-2020-06-04_18:03:48-rados-wip-yuri2-testing-2020-06-03-2341-MASTER-distro-basic-smithi/5117783
- 04:01 PM Bug #20960: ceph_test_rados: mismatched version (due to pg import/export)
- /a/yuriw-2020-06-04_18:03:48-rados-wip-yuri2-testing-2020-06-03-2341-MASTER-distro-basic-smithi/5118028
- 03:58 PM Bug #44517: osd/osd-backfill-space.sh TEST_backfill_multi_partial: pgs didn't go active+clean
- ...
06/04/2020
- 09:15 PM Bug #45868: rados_api_tests: LibRadosWatchNotify.AioWatchNotify2 fails
- Similar...
- 09:06 PM Bug #45661 (Fix Under Review): valgrind issue: UninitValue in ProtocolV2
- https://github.com/ceph/ceph/pull/35407
- 10:07 AM Bug #45661: valgrind issue: UninitValue in ProtocolV2
- Pin-pointed to a branch of @PrimaryLogPG::do_manifest_flush()@:...
- 08:36 AM Bug #45661: valgrind issue: UninitValue in ProtocolV2
- ...
- 06:08 PM Bug #45795: PrimaryLogPG.cc: 627: FAILED ceph_assert(!get_acting_recovery_backfill().empty())
- Ah, that makes sense. It should suffice to simply not populate_obc_watchers if replica.
- 05:42 PM Bug #45795: PrimaryLogPG.cc: 627: FAILED ceph_assert(!get_acting_recovery_backfill().empty())
- After more digging, this doesn't appear to be related to notifies being sent to replicas.
The issue seems to be wi...
- 12:48 PM Backport #45890 (In Progress): nautilus: osd: pg stuck in waitactingchange when new acting set do...
- 11:58 AM Backport #45890 (Resolved): nautilus: osd: pg stuck in waitactingchange when new acting set doesn...
- https://github.com/ceph/ceph/pull/35389
- 12:44 PM Backport #45883 (In Progress): nautilus: osd-scrub-repair.sh: SyntaxError: invalid syntax
- 11:55 AM Backport #45883 (Resolved): nautilus: osd-scrub-repair.sh: SyntaxError: invalid syntax
- https://github.com/ceph/ceph/pull/35388
- 12:44 PM Backport #45780 (In Progress): nautilus: rados/test_envlibrados_for_rocksdb.sh build failure (see...
- 12:43 PM Backport #45776 (In Progress): nautilus: build_incremental_map_msg missing incremental map while ...
- 11:59 AM Backport #45892 (Rejected): mimic: osd: pg stuck in waitactingchange when new acting set doesn't ...
- https://github.com/ceph/ceph/pull/35484
- 11:59 AM Backport #45891 (Rejected): luminous: osd: pg stuck in waitactingchange when new acting set doesn...
- https://github.com/ceph/ceph/pull/35485
- 11:55 AM Backport #45884 (Resolved): octopus: osd-scrub-repair.sh: SyntaxError: invalid syntax
- https://github.com/ceph/ceph/pull/35445
- 11:55 AM Backport #45882 (Resolved): octopus: Objecter: don't attempt to read from non-primary on EC pools
- https://github.com/ceph/ceph/pull/35444
- 07:16 AM Bug #45871 (New): Incorrect (0) number of slow requests in health check
- ceph version 14.2.9-899-gc02349c600 (c02349c60052aaa6c7bd0c2270c7f7be16fab632) nautilus (stable)
Our cluster shows...
- 12:24 AM Bug #40117 (Duplicate): PG stuck in WaitActingChange
- Fixed in https://tracker.ceph.com/issues/41190
- 12:21 AM Bug #41190 (Pending Backport): osd: pg stuck in waitactingchange when new acting set doesn't change
- 12:20 AM Bug #41236 (Resolved): cosbench failures in rados/perf
- 12:18 AM Bug #41550 (Resolved): os/bluestore: fadvise_flag leak in generate_transaction
- 12:17 AM Bug #41677 (Resolved): Cephmon:fix mon crash
- Fixed as a part of https://tracker.ceph.com/issues/41680.
- 12:14 AM Bug #41913 (Resolved): With auto scaler operating stopping an OSD can lead to COT crashing instea...
- 12:08 AM Bug #45356 (Resolved): nautilus: rados/upgrade/mimic-x-singleton failures due to mon_client_direc...
06/03/2020
- 09:06 PM Bug #45733 (Pending Backport): osd-scrub-repair.sh: SyntaxError: invalid syntax
- 06:12 PM Bug #45733: osd-scrub-repair.sh: SyntaxError: invalid syntax
- https://github.com/ceph/ceph/pull/35279 merged
- 08:50 PM Backport #45357: octopus: rados: Sharded OpWQ drops suicide_grace after waiting for work
- Dan Hill wrote:
> https://github.com/ceph/ceph/pull/34881
merged
- 08:34 PM Bug #45868 (Resolved): rados_api_tests: LibRadosWatchNotify.AioWatchNotify2 fails
- ...
- 08:30 PM Bug #45761: mon_thrasher: "Error ENXIO: mon unavailable" during sync_force command leads to "fail...
- /a/yuriw-2020-06-02_15:07:59-rados-wip-yuri7-testing-2020-06-01-2256-octopus-distro-basic-smithi/5113082 - octopus
- 04:44 AM Bug #45761: mon_thrasher: "Error ENXIO: mon unavailable" during sync_force command leads to "fail...
- Moving this since it appears to be a problem with the mon_thrasher (or the MONs or monclients)....
- 02:44 PM Bug #45793 (Pending Backport): Objecter: don't attempt to read from non-primary on EC pools
- 01:24 PM Backport #41533: mimic: Move bluefs alloc size initialization log message to log level 1
- This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/30219
m...
- 12:59 PM Bug #45857 (New): crimson/alien_store: alienstore cannot open_collections
- setup: setting debug level 20 for bluestore, filestore and osd and using seastar with seastar_default_allocator + Rel...
- 01:50 AM Bug #9984: lttng_probe_unregister hangs on shutdown
- /a/yuriw-2020-05-30_02:18:17-rados-wip-yuri-master_5.29.20-distro-basic-smithi/5104372
Possibly an instance of thi...