Project

General

Profile

Activity

From 06/03/2020 to 07/02/2020

07/02/2020

04:46 PM Bug #46285 (Rejected): osd: error from smartctl is always reported as invalid JSON
Turns out the report was from an earlier version (it did not contain the 'output' key) Josh Durgin
04:37 PM Bug #46179 (Duplicate): Health check failed: Reduced data availability: PG_AVAILABILITY
Neha Ojha
04:36 PM Bug #46225 (Duplicate): Health check failed: 1 osds down (OSD_DOWN)
Neha Ojha
01:35 PM Bug #46264: mon: check for mismatched daemon versions
I have completed a function called check_daemon_version, located in src/mon/Monitor.cc. This function goes through mon_... Tyler Sheehan
09:48 AM Bug #44755 (Pending Backport): Create stronger affinity between drivegroup specs and osd daemons
Sebastian Wagner
09:04 AM Bug #46178 (Duplicate): slow request osd_op(... (undecoded) ondisk+retry+read+ignore_overlay+know...
Ilya Dryomov
08:56 AM Bug #46180 (Resolved): qa: Scrubbing terminated -- not all pgs were active and clean.
Will be cherry-picked into https://github.com/ceph/ceph/pull/35720 and https://github.com/ceph/ceph/pull/35733. Ilya Dryomov

07/01/2020

10:55 PM Bug #46325 (Rejected): A pool at size 3 should have a min_size 2

The get_osd_pool_default_min_size() calculation of size - size/2 for the min_size should special-case size 3 and ju...
David Zafman
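For context, the derivation discussed in this report is min_size = size - size/2 with integer division. A quick sketch of what that yields for common sizes (default_min_size is a hypothetical stand-in for get_osd_pool_default_min_size(), not the actual code):

```python
def default_min_size(size: int) -> int:
    # size - size/2 with integer (truncating) division, i.e. ceil(size / 2)
    return size - size // 2

# Note that size=3 already maps to min_size=2 under this formula:
print({s: default_min_size(s) for s in range(1, 7)})
```

Tabulating it this way makes it easy to check which pool sizes, if any, actually need a special case.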
10:03 PM Bug #37509 (Can't reproduce): require past_interval bounds mismatch due to osd oldest_map
Neha Ojha
09:58 PM Bug #23879 (Can't reproduce): test_mon_osdmap_prune.sh fails
Neha Ojha
09:57 PM Bug #23857 (Can't reproduce): flush (manifest) vs async recovery causes out of order op
Neha Ojha
09:56 PM Bug #23828 (Can't reproduce): ec gen object leaks into different filestore collection just after ...
Neha Ojha
09:53 PM Bug #23117: PGs stuck in "activating" after osd_max_pg_per_osd_hard_ratio has been exceeded once
We should try to make it more obvious when this limit is hit. I thought we added something in the cluster logs about ... Neha Ojha
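For readers who hit this: the hard limit in question is the product of mon_max_pg_per_osd and osd_max_pg_per_osd_hard_ratio. A minimal sketch of the arithmetic, with illustrative values only (check the live settings on your own cluster rather than trusting these defaults):

```python
def pg_hard_limit(mon_max_pg_per_osd: int, hard_ratio: float) -> int:
    # Once an OSD's PG count would cross this product, it stops
    # accepting new PGs, which leaves them stuck in "activating".
    return int(mon_max_pg_per_osd * hard_ratio)

# Illustrative values; read the actual ones with `ceph config get`.
print(pg_hard_limit(250, 3.0))
```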
09:49 PM Documentation #46324 (New): Sepia VPN Client Access documentation is out-of-date
https://wiki.sepia.ceph.com/doku.php?id=vpnaccess#vpn_client_access
There are two issues that I noticed that must ...
Zac Dover
09:49 PM Bug #20960 (Can't reproduce): ceph_test_rados: mismatched version (due to pg import/export)
The thrash_cache_writeback_proxy_none failure has a different root cause, opened a new tracker for it https://tracker... Neha Ojha
09:47 PM Bug #46323 (Resolved): thrash_cache_writeback_proxy_none: FAILED ceph_assert(version == old_value...
... Neha Ojha
09:35 PM Bug #19700 (Closed): OSD remained up despite cluster network being inactive?
Please reopen this bug if the issue is seen in nautilus or newer releases. Neha Ojha
09:22 PM Bug #43882 (Can't reproduce): osd to mon connection lost, osd stuck down
Neha Ojha
09:16 PM Bug #44631 (Can't reproduce): ceph pg dump error code 124
Neha Ojha
07:58 PM Bug #46275: Cancellation of on-going scrubs
We may be able to easily terminate scrubbing in between chunks if the noscrub/nodeep-scrub flags get set.
I will test this.
David Zafman
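The approach suggested above, re-checking the cluster flags between chunks, can be sketched roughly like this (the names are illustrative; the real scrub machinery lives in the OSD's C++ code):

```python
def scrub_pg(chunks, flags):
    """Scrub one chunk at a time, re-checking cluster flags between
    chunks so an in-flight scrub can be cancelled early instead of
    running to completion."""
    done = []
    for chunk in chunks:
        if "noscrub" in flags:
            return done, "aborted"   # terminated between chunks
        done.append(chunk)
    return done, "completed"

print(scrub_pg(["c1", "c2"], set()))        # runs to completion
print(scrub_pg(["c1", "c2"], {"noscrub"}))  # aborts before the next chunk
```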
07:56 PM Bug #46275 (In Progress): Cancellation of on-going scrubs
David Zafman
07:32 PM Backport #46095 (Resolved): octopus: Issue health status warning if num_shards_repaired exceeds s...
Josh Durgin
07:22 PM Backport #46115: octopus: Add statfs output to ceph-objectstore-tool
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/35715
merged
Yuri Weinstein
06:00 PM Bug #46318 (Need More Info): mon_recovery: quorum_status times out
... Neha Ojha
05:05 PM Bug #46285: osd: error from smartctl is always reported as invalid JSON
Which version is this cluster running?
I would expect to see this "output" key in the command's output:
https://g...
Yaarit Hatuka
02:43 AM Bug #46285 (Rejected): osd: error from smartctl is always reported as invalid JSON
When smartctl returns an error, the osd always reports it as invalid json. We meant to give a better error, but the c... Josh Durgin
02:51 AM Backport #46287 (Rejected): nautilus: mon: log entry with garbage generated by bad memory access
Patrick Donnelly
02:51 AM Backport #46286 (Resolved): octopus: mon: log entry with garbage generated by bad memory access
https://github.com/ceph/ceph/pull/36035 Patrick Donnelly

06/30/2020

09:27 PM Bug #46222 (Won't Fix): Cbt installation task for cosbench fails.
The root cause of this issue is that we put an older version of cosbench in https://drop.ceph.com/qa/ after the recen... Neha Ojha
01:07 PM Bug #46222: Cbt installation task for cosbench fails.

http://qa-proxy.ceph.com/teuthology/ideepika-2020-06-29_08:23:54-rados-wip-deepika-testing-2020-06-25-2058-distro-b...
Deepika Upadhyay
05:37 PM Bug #46216 (Pending Backport): mon: log entry with garbage generated by bad memory access
Patrick Donnelly
04:41 PM Bug #46216 (Fix Under Review): mon: log entry with garbage generated by bad memory access
Neha Ojha
04:23 PM Documentation #46279 (New): various matters related to ceph mon and orch cephadm -- this is sever...
<andyg5> Hi, I am trying to move the MONitors over to the public network, and I'm not sure how to do it. I have setu... Zac Dover
03:07 PM Bug #46224 (Resolved): Health check failed: 4 mgr modules have failed (MGR_MODULE_ERROR)
Neha Ojha
01:52 PM Bug #46275 (Resolved): Cancellation of on-going scrubs
Although it's possible to prevent initiating new scrubs, we don't have a facility for terminating already on-going on... Radoslaw Zarzynski
08:30 AM Bug #46264: mon: check for mismatched daemon versions
Hm, what do you expect? Upgrade scenarios can become complicated with more than two versions running at the same time ... Sebastian Wagner

06/29/2020

09:22 PM Bug #46266 (Need More Info): Monitor crashed in creating pool in CrushTester::test_with_fork()
Hi. I was creating a new pool and one of my monitors crashed.... Seena Fallah
06:44 PM Bug #43553: mon: client mon_status fails
/ceph/teuthology-archive/yuriw-2020-06-25_22:31:00-fs-octopus-distro-basic-smithi/5180260/teuthology.log Patrick Donnelly
06:10 PM Bug #46264 (Resolved): mon: check for mismatched daemon versions
There is currently no test to check whether the daemons are all running the same version of Ceph. Tyler Sheehan
05:44 PM Bug #20960: ceph_test_rados: mismatched version (due to pg import/export)
/a/dis-2020-06-28_18:43:20-rados-wip-msgr21-fix-reuse-rebuildci-distro-basic-smithi/5186890 Neha Ojha
05:36 PM Bug #45761: mon_thrasher: "Error ENXIO: mon unavailable" during sync_force command leads to "fail...
/a/dis-2020-06-28_18:43:20-rados-wip-msgr21-fix-reuse-rebuildci-distro-basic-smithi/5186759 Neha Ojha
05:02 PM Backport #46262 (Resolved): nautilus: larger osd_scrub_max_preemptions values cause Floating poin...
https://github.com/ceph/ceph/pull/37470 Nathan Cutler
05:01 PM Backport #46261 (Resolved): octopus: larger osd_scrub_max_preemptions values cause Floating point...
https://github.com/ceph/ceph/pull/36034 Nathan Cutler
12:26 PM Bug #46178: slow request osd_op(... (undecoded) ondisk+retry+read+ignore_overlay+known_if_redirec...
https://pulpito.ceph.com/swagner-2020-06-29_09:26:42-rados:cephadm-wip-swagner-testing-2020-06-26-1524-distro-basic-s... Sebastian Wagner
08:53 AM Bug #44352: pool listings are slow after deleting objects
This was on the latest nautilus release at the time, the DB should have been on SSD but I don't remember. But good po... Paul Emmerich
08:50 AM Bug #45381: unfound objects in erasure-coded CephFS
No, this setup is luckily without any cache tiering. It's a completely standard setup with replicated cephfs_metadata... Paul Emmerich

06/28/2020

10:45 AM Bug #46180 (Fix Under Review): qa: Scrubbing terminated -- not all pgs were active and clean.
Ilya Dryomov
05:17 AM Bug #46024 (Pending Backport): larger osd_scrub_max_preemptions values cause Floating point excep...
xie xingguo

06/27/2020

04:20 PM Bug #46242 (New): rados -p default.rgw.buckets.data returning over millions objects No such file ...
Hi Dev,
Due to sharding / S3 bugs we synced the customer's bucket to new ones.
Once we tried to delete, we're unab...
Manuel Rios
03:15 PM Bug #44595: cache tiering: Error: oid 48 copy_from 493 returned error code -2
/a/kchai-2020-06-27_07:37:00-rados-wip-kefu-testing-2020-06-27-1407-distro-basic-smithi/5183671/ Kefu Chai
08:25 AM Bug #45721: CommandFailedError: Command failed (workunit test rados/test_python.sh) FAIL: test_ra...
... Kefu Chai

06/26/2020

07:27 PM Bug #46178: slow request osd_op(... (undecoded) ondisk+retry+read+ignore_overlay+known_if_redirec...
http://pulpito.ceph.com/mgfritch-2020-06-26_02:07:27-rados-wip-mgfritch-testing-2020-06-25-1855-distro-basic-smithi/... Michael Fritch
06:21 PM Bug #46180: qa: Scrubbing terminated -- not all pgs were active and clean.
Here's a reliable reproducer for the issue:
-s rados/singleton-nomsgr -c master --filter 'all/health-warnings rado...
Neha Ojha
06:50 AM Bug #46180: qa: Scrubbing terminated -- not all pgs were active and clean.
I think it has to do with reconnect handling and how connections are reused.
This part of ProtocolV2 is pretty fra...
Ilya Dryomov
05:04 AM Bug #46180: qa: Scrubbing terminated -- not all pgs were active and clean.
This is a msgr2.1 issue.
Ilya Dryomov
05:48 PM Bug #46225 (Triaged): Health check failed: 1 osds down (OSD_DOWN)
Neha Ojha
05:39 PM Bug #46225: Health check failed: 1 osds down (OSD_DOWN)
Also, related to https://tracker.ceph.com/issues/46180... Neha Ojha
10:57 AM Bug #46225 (Duplicate): Health check failed: 1 osds down (OSD_DOWN)
/a/sseshasa-2020-06-24_17:46:09-rados-wip-sseshasa-testing-2020-06-24-1858-distro-basic-smithi/5176410
2020-06-2...
Sridhar Seshasayee
05:34 PM Bug #46227 (Duplicate): Segmentation fault when running ceph_test_keyvaluedb command as part of a...
Duplicate of https://tracker.ceph.com/issues/46054 Neha Ojha
11:19 AM Bug #46227 (Duplicate): Segmentation fault when running ceph_test_keyvaluedb command as part of a...
/a/sseshasa-2020-06-24_17:46:09-rados-wip-sseshasa-testing-2020-06-24-1858-distro-basic-smithi/5176446
Unfortunate...
Sridhar Seshasayee
05:31 PM Bug #46179 (Triaged): Health check failed: Reduced data availability: PG_AVAILABILITY
Neha Ojha
05:11 PM Bug #46179: Health check failed: Reduced data availability: PG_AVAILABILITY
This failure is different from the one seen in the RGW suite earlier due to upmap. This is related to https://tracker... Neha Ojha
07:32 AM Bug #46179: Health check failed: Reduced data availability: PG_AVAILABILITY
/a/sseshasa-2020-06-24_17:46:09-rados-wip-sseshasa-testing-2020-06-24-1858-distro-basic-smithi/
job ID: 5176200
F...
Sridhar Seshasayee
05:31 PM Bug #46224 (Fix Under Review): Health check failed: 4 mgr modules have failed (MGR_MODULE_ERROR)
Neha Ojha
10:44 AM Bug #46224 (Resolved): Health check failed: 4 mgr modules have failed (MGR_MODULE_ERROR)
/a/sseshasa-2020-06-24_17:46:09-rados-wip-sseshasa-testing-2020-06-24-1858-distro-basic-smithi/5176341 and
/a/ssesha...
Sridhar Seshasayee
05:30 PM Bug #46222 (In Progress): Cbt installation task for cosbench fails.
Neha Ojha
09:03 AM Bug #46222: Cbt installation task for cosbench fails.
See /a/sseshasa-2020-06-24_17:46:09-rados-wip-sseshasa-testing-2020-06-24-1858-distro-basic-smithi/5176322 as well Sridhar Seshasayee
09:00 AM Bug #46222 (Won't Fix): Cbt installation task for cosbench fails.
/a/sseshasa-2020-06-24_17:46:09-rados-wip-sseshasa-testing-2020-06-24-1858-distro-basic-smithi/5176309
2020-06-2...
Sridhar Seshasayee
04:48 PM Feature #46238 (New): raise a HEALTH warn, if OSDs use the cluster_network for the front
Related to: https://tracker.ceph.com/issues/46230 Michal Nasiadka
12:17 PM Backport #46229 (In Progress): octopus: Ceph Monitor heartbeat grace period does not reset.
Sridhar Seshasayee
12:14 PM Backport #46229 (New): octopus: Ceph Monitor heartbeat grace period does not reset.
Sridhar Seshasayee
11:48 AM Backport #46229 (Resolved): octopus: Ceph Monitor heartbeat grace period does not reset.
https://github.com/ceph/ceph/pull/35799 Sridhar Seshasayee
12:13 PM Backport #46228 (In Progress): nautilus: Ceph Monitor heartbeat grace period does not reset.
Sridhar Seshasayee
12:13 PM Backport #46228 (New): nautilus: Ceph Monitor heartbeat grace period does not reset.
Sridhar Seshasayee
11:47 AM Backport #46228 (Resolved): nautilus: Ceph Monitor heartbeat grace period does not reset.
https://github.com/ceph/ceph/pull/35798 Sridhar Seshasayee
11:43 AM Bug #45943 (Pending Backport): Ceph Monitor heartbeat grace period does not reset.
Sridhar Seshasayee
11:14 AM Documentation #46203 (Resolved): docs.ceph.com is down
docs.ceph.com returned four hours later. Zac Dover
08:40 AM Bug #24057: cbt fails to copy results to the archive dir
Observed the issue during this run:
/a/sseshasa-2020-06-24_17:46:09-rados-wip-sseshasa-testing-2020-06-24-1858-distr...
Sridhar Seshasayee
07:28 AM Bug #44595: cache tiering: Error: oid 48 copy_from 493 returned error code -2
/a/sseshasa-2020-06-24_17:46:09-rados-wip-sseshasa-testing-2020-06-24-1858-distro-basic-smithi/
job ID: 5176184
...
Sridhar Seshasayee
07:18 AM Bug #45441: rados: Health check failed: 1/3 mons down, quorum a,c (MON_DOWN)" in cluster log'
Observing the issue during this run:
/a/sseshasa-2020-06-24_17:46:09-rados-wip-sseshasa-testing-2020-06-24-1858-dist...
Sridhar Seshasayee
04:38 AM Bug #46125: ceph mon memory increasing
I will try with default settings for the monitor. With current config file parameters, the monitor is using 1GB.
I...
Ashish Nagar

06/25/2020

11:56 PM Bug #46216 (Resolved): mon: log entry with garbage generated by bad memory access
Causes the mgr to segmentation fault:... Patrick Donnelly
10:27 PM Bug #46178: slow request osd_op(... (undecoded) ondisk+retry+read+ignore_overlay+known_if_redirec...
/a/yuvalif-2020-06-23_14:40:15-rgw-wip-yuval-test-35331-35155-distro-basic-smithi/5173465
Seems very likely to hav...
Neha Ojha
09:09 PM Bug #46125 (Need More Info): ceph mon memory increasing
Can you try with the default settings for the monitor? What level of memory usage are you seeing exactly?
There is...
Josh Durgin
07:40 PM Bug #46180: qa: Scrubbing terminated -- not all pgs were active and clean.
The common thing in all of these is that the tests are all failing while running the ceph task, no thrashing or anyth... Neha Ojha
03:15 PM Bug #46180: qa: Scrubbing terminated -- not all pgs were active and clean.
Saw the same error during this run:
http://pulpito.ceph.com/sseshasa-2020-06-24_17:46:09-rados-wip-sseshasa-testing-...
Sridhar Seshasayee
05:29 PM Bug #46211 (Duplicate): qa: pools stuck in creating
Patrick Donnelly
05:26 PM Bug #46211 (Duplicate): qa: pools stuck in creating
During cluster setup for the CephFS suites, we see this failure:... Patrick Donnelly
03:44 PM Bug #39039: mon connection reset, command not resent
Hitting this issue on octopus, Fedora 32:... Sunny Kumar
02:18 PM Documentation #46203 (In Progress): docs.ceph.com is down
I'm afraid this is outside my control. We're at the mercy of our cloud provider. Pretty sure it's this: http://trav... David Galloway
07:49 AM Documentation #46203 (Resolved): docs.ceph.com is down
docs.ceph.com has been down since 17:35 AEST, 25 Jun 2020, at the latest.
https://downforeveryoneorjustme.com/docs.ce...
Zac Dover

06/24/2020

02:16 PM Bug #46180 (Resolved): qa: Scrubbing terminated -- not all pgs were active and clean.
Seeing several test failures in the rgw suite:... Casey Bodley
02:09 PM Bug #46179 (Duplicate): Health check failed: Reduced data availability: PG_AVAILABILITY
multiple RGW tests are failing on different branches, with:... Casey Bodley
01:20 PM Bug #46178: slow request osd_op(... (undecoded) ondisk+retry+read+ignore_overlay+known_if_redirec...
http://pulpito.ceph.com/swagner-2020-06-24_11:30:44-rados:cephadm-wip-swagner3-testing-2020-06-24-1025-distro-basic-s... Sebastian Wagner
01:19 PM Bug #46178: slow request osd_op(... (undecoded) ondisk+retry+read+ignore_overlay+known_if_redirec...
http://pulpito.ceph.com/swagner-2020-06-24_11:30:44-rados:cephadm-wip-swagner3-testing-2020-06-24-1025-distro-basic-s... Sebastian Wagner
01:16 PM Bug #46178: slow request osd_op(... (undecoded) ondisk+retry+read+ignore_overlay+known_if_redirec...
http://pulpito.ceph.com/swagner-2020-06-24_11:30:44-rados:cephadm-wip-swagner3-testing-2020-06-24-1025-distro-basic-s... Sebastian Wagner
12:57 PM Bug #46178 (Duplicate): slow request osd_op(... (undecoded) ondisk+retry+read+ignore_overlay+know...
Saw this error yesterday for the first time:
http://pulpito.ceph.com/swagner-2020-06-23_13:15:09-rados:cephadm-wip...
Sebastian Wagner
10:37 AM Backport #45676 (Resolved): octopus: rados/test_envlibrados_for_rocksdb.sh fails on Xenial (seen ...
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/35236
m...
Nathan Cutler
02:47 AM Bug #45721: CommandFailedError: Command failed (workunit test rados/test_python.sh) FAIL: test_ra...
... Kefu Chai
01:36 AM Backport #46164 (In Progress): nautilus: osd: make message cap option usable again
Neha Ojha
01:13 AM Backport #46164 (Resolved): nautilus: osd: make message cap option usable again
https://github.com/ceph/ceph/pull/35738 Neha Ojha
01:28 AM Backport #46165 (In Progress): octopus: osd: make message cap option usable again
Neha Ojha
01:13 AM Backport #46165 (Resolved): octopus: osd: make message cap option usable again
https://github.com/ceph/ceph/pull/35737 Neha Ojha
12:18 AM Bug #46143 (Pending Backport): osd: make message cap option usable again
Neha Ojha

06/23/2020

08:11 PM Backport #45676: octopus: rados/test_envlibrados_for_rocksdb.sh fails on Xenial (seen in nautilus)
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/35236
merged
Yuri Weinstein
12:15 AM Bug #45944: osd/osd-markdown.sh: TEST_osd_stop failed
... Neha Ojha

06/22/2020

09:59 PM Backport #46115 (In Progress): octopus: Add statfs output to ceph-objectstore-tool
David Zafman
09:37 PM Backport #46116 (In Progress): nautilus: Add statfs output to ceph-objectstore-tool
David Zafman
06:52 PM Bug #45944: osd/osd-markdown.sh: TEST_osd_stop failed
/a/teuthology-2020-06-19_07:01:02-rados-master-distro-basic-smithi/5164221 Neha Ojha
05:53 PM Bug #46143 (Fix Under Review): osd: make message cap option usable again
Neha Ojha
05:36 PM Bug #46143 (In Progress): osd: make message cap option usable again
Neha Ojha
05:18 PM Bug #46143 (Resolved): osd: make message cap option usable again
"This reverts commit 45d5ac3.
Without a msg throttler, we can't change osd_client_message_cap
online. The thr...
Neha Ojha
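The revert tracked here restores a message throttler so that osd_client_message_cap can take effect when changed online. A toy sketch of why a live throttle object makes the cap adjustable at runtime (MessageThrottle is a made-up class for illustration, not Ceph's actual Throttle):

```python
class MessageThrottle:
    """Toy counting throttle: admits a message only while the
    in-flight count is below the cap; the cap can change online."""
    def __init__(self, cap):
        self.cap = cap
        self.in_flight = 0

    def try_admit(self):
        if self.in_flight >= self.cap:
            return False
        self.in_flight += 1
        return True

    def release(self):
        self.in_flight -= 1

    def set_cap(self, cap):
        # Online reconfiguration hook: without a throttle object to
        # update, a new cap value has nowhere to land.
        self.cap = cap

t = MessageThrottle(cap=2)
assert t.try_admit() and t.try_admit()
assert not t.try_admit()   # cap reached
t.set_cap(3)               # raised online, takes effect immediately
assert t.try_admit()
```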
04:57 PM Bug #41154: osd: pg unknown state
I again have this problem.... Alexander Kazansky
03:19 PM Documentation #46141 (New): Document automatic OSD deployment behavior better
Make certain that the documentation notifies readers that OSDs are automatically created, so that they are not caught... Zac Dover
09:12 AM Bug #46137: Monitor leader is marking multiple osd's down
Every few minutes, multiple OSDs go down and come back up, which is causing recovery of data. This is occurring ... Prayank Saxena
09:07 AM Bug #46137 (New): Monitor leader is marking multiple osd's down
My Ceph cluster consists of 5 MONs and 58 DNs with 1302 total OSDs (HDDs) on the 12.2.8 Luminous (stable) version and Fi... Prayank Saxena
06:02 AM Bug #45943: Ceph Monitor heartbeat grace period does not reset.
Updates from testing the fix:
OSD failure before being marked down:...
Sridhar Seshasayee

06/21/2020

02:17 PM Feature #24099: osd: Improve workflow when creating OSD on raw block device if there was bluestor...
John Spray wrote:
> This seems like an odd idea -- if someone is doing OSD creation by hand, why would they want to ...
Niklas Hambuechen
12:25 PM Documentation #46099: document statfs operation for ceph-objectstore-tool

if (op == "statfs") {
  store_statfs_t statsbuf;
  ret = fs->statfs(&statsbuf);
  if (ret < 0) {
    ...
Zac Dover
12:10 PM Documentation #46126 (New): RGW docs lack an explanation of how permissions management works, esp...
<dirtwash> you know its sshitty protocol and design if obvious things arent visible and default behavior doesnt work
...
Zac Dover
08:02 AM Bug #46125: ceph mon memory increasing
Hi,
I have deployed a single-node Ceph cluster.
ceph version 14.2.9 (581f22da52345dba46ee232b73b990f06029a2a0) na...
Ashish Nagar
07:13 AM Bug #46125 (Need More Info): ceph mon memory increasing
Hi,
I have deployed a single-node Ceph cluster.
ceph version 14.2.9 (581f22da52345dba46ee232b73b990f06029a2a0) ...
Ashish Nagar

06/20/2020

10:12 PM Backport #46096 (In Progress): nautilus: Issue health status warning if num_shards_repaired excee...
Nathan Cutler
10:09 PM Backport #46095 (In Progress): octopus: Issue health status warning if num_shards_repaired exceed...
Nathan Cutler
09:57 PM Bug #45793 (Resolved): Objecter: don't attempt to read from non-primary on EC pools
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler
09:56 PM Backport #45882 (Resolved): octopus: Objecter: don't attempt to read from non-primary on EC pools
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/35444
m...
Nathan Cutler
09:56 PM Backport #45775 (Resolved): octopus: build_incremental_map_msg missing incremental map while snap...
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/35442
m...
Nathan Cutler
07:59 AM Documentation #46120 (Resolved): Improve ceph-objectstore-tool documentation
https://github.com/ceph/ceph/pull/33823
There are a number of comments by David Zafman that I failed to include in...
Zac Dover
04:20 AM Bug #46065 (Resolved): sudo missing from command in monitor-bootstrapping procedure
Zac Dover

06/19/2020

04:36 PM Backport #46116 (Resolved): nautilus: Add statfs output to ceph-objectstore-tool
https://github.com/ceph/ceph/pull/35713 Nathan Cutler
04:36 PM Backport #46115 (Resolved): octopus: Add statfs output to ceph-objectstore-tool
https://github.com/ceph/ceph/pull/35715 Nathan Cutler
05:00 AM Documentation #46099 (New): document statfs operation for ceph-objectstore-tool
https://github.com/ceph/ceph/pull/35632
https://github.com/ceph/ceph/pull/33823
The affected file (I think) is ...
Zac Dover

06/18/2020

11:26 PM Bug #46064 (Pending Backport): Add statfs output to ceph-objectstore-tool
David Zafman
01:13 AM Bug #46064 (Fix Under Review): Add statfs output to ceph-objectstore-tool
David Zafman
01:08 AM Bug #46064 (In Progress): Add statfs output to ceph-objectstore-tool
David Zafman
01:07 AM Bug #46064 (Resolved): Add statfs output to ceph-objectstore-tool

This will help diagnose out of space crashes:...
David Zafman
10:32 PM Backport #45882: octopus: Objecter: don't attempt to read from non-primary on EC pools
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/35444
merged
Yuri Weinstein
10:31 PM Backport #45775: octopus: build_incremental_map_msg missing incremental map while snaptrim or bac...
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/35442
merged
Yuri Weinstein
08:08 PM Backport #46096 (Resolved): nautilus: Issue health status warning if num_shards_repaired exceeds ...
https://github.com/ceph/ceph/pull/36379 Patrick Donnelly
08:08 PM Backport #46095 (Resolved): octopus: Issue health status warning if num_shards_repaired exceeds s...
https://github.com/ceph/ceph/pull/35685 Patrick Donnelly
08:06 PM Backport #46090 (Resolved): nautilus: PG merge: FAILED ceph_assert(info.history.same_interval_sin...
https://github.com/ceph/ceph/pull/36161 Patrick Donnelly
08:06 PM Backport #46089 (Resolved): octopus: PG merge: FAILED ceph_assert(info.history.same_interval_sinc...
https://github.com/ceph/ceph/pull/36033 Patrick Donnelly
08:06 PM Backport #46086 (Resolved): octopus: osd: wakeup all threads of shard rather than one thread
https://github.com/ceph/ceph/pull/36032 Patrick Donnelly
10:40 AM Bug #46071 (New): potential rocksdb failure: few osd's service not starting up after node reboot....
A data node went down abruptly due to an issue with the SPS-BD Smart Array PCIe SAS Expander; once the hardware was changed, the node c... Prayank Saxena
03:30 AM Bug #46065 (Fix Under Review): sudo missing from command in monitor-bootstrapping procedure
https://github.com/ceph/ceph/pull/35635 Zac Dover
03:25 AM Bug #46065 (Resolved): sudo missing from command in monitor-bootstrapping procedure
Where:
https://docs.ceph.com/docs/master/install/manual-deployment/#monitor-bootstrapping
What:
<badone> https:/...
Zac Dover

06/17/2020

09:21 PM Bug #45991 (Pending Backport): PG merge: FAILED ceph_assert(info.history.same_interval_since != 0)
Neha Ojha
11:42 AM Bug #45991 (Fix Under Review): PG merge: FAILED ceph_assert(info.history.same_interval_since != 0)
Kefu Chai
09:19 PM Bug #46024 (Fix Under Review): larger osd_scrub_max_preemptions values cause Floating point excep...
Neha Ojha
09:19 PM Bug #46043 (Need More Info): osd/ECBackend.cc: 1551: FAILED assert(!(*m).is_missing(hoid))
It is really hard to say what caused this assert without enough debug logging and I doubt we will able to reproduce t... Neha Ojha
07:41 AM Bug #46043 (Need More Info): osd/ECBackend.cc: 1551: FAILED assert(!(*m).is_missing(hoid))
We observed this crush on on one of the customer servers:... Mykola Golub
05:22 PM Feature #41564 (Pending Backport): Issue health status warning if num_shards_repaired exceeds som...
David Zafman
03:35 PM Bug #46053 (Resolved): osd: wakeup all threads of shard rather than one thread
Neha Ojha

06/16/2020

01:52 AM Bug #46024 (Resolved): larger osd_scrub_max_preemptions values cause Floating point exception

A non-default large osd_scrub_max_preemptions value (e.g., 32) would cause scrubber.preempt_divisor underflow and...
xie xingguo
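One plausible reading of the failure described above: a divisor adjusted once per preemption can be driven out of range when many preemptions are allowed, and a subsequent integer division by zero raises SIGFPE in C++. A sketch of that failure shape and the obvious clamp, emulating 32-bit wraparound (the names and exact arithmetic are illustrative, not the actual scrubber code):

```python
def preempt_divisor_after(preemptions: int) -> int:
    # Emulate a 32-bit counter doubled once per preemption; after 32
    # doublings it wraps to zero.
    divisor = 1
    for _ in range(preemptions):
        divisor = (divisor * 2) & 0xFFFFFFFF
    return divisor

def chunk_size(chunk_max: int, divisor: int) -> int:
    # In C++, dividing by a zero divisor here is the SIGFPE;
    # clamping the divisor to at least 1 guards against it.
    return chunk_max // max(divisor, 1)

assert preempt_divisor_after(5) == 32
assert preempt_divisor_after(32) == 0        # wrapped to zero -> crash in C++
assert chunk_size(25, preempt_divisor_after(32)) == 25  # clamp avoids it
```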

06/15/2020

07:25 PM Backport #46018 (Resolved): octopus: ceph_test_rados_watch_notify hang
Nathan Cutler
07:25 PM Backport #46017 (Resolved): nautilus: ceph_test_rados_watch_notify hang
https://github.com/ceph/ceph/pull/36031 Nathan Cutler
07:24 PM Backport #46016 (Resolved): octopus: osd-backfill-stats.sh failing intermittently in TEST_backfil...
https://github.com/ceph/ceph/pull/36030 Nathan Cutler
07:22 PM Bug #45612 (Resolved): qa: powercycle: install task runs twice with double unwind causing fatal e...
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler
07:21 PM Backport #46007 (Resolved): octopus: PrimaryLogPG.cc: 627: FAILED ceph_assert(!get_acting_recover...
https://github.com/ceph/ceph/pull/36029 Nathan Cutler

06/13/2020

05:26 AM Bug #45991 (Resolved): PG merge: FAILED ceph_assert(info.history.same_interval_since != 0)
http://qa-proxy.ceph.com/teuthology/xxg-2020-06-13_00:34:59-rados:thrash-wip-nautilus-nnnn-distro-basic-smithi/514318... xie xingguo

06/12/2020

02:50 PM Backport #45884: octopus: osd-scrub-repair.sh: SyntaxError: invalid syntax
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/35445
m...
Nathan Cutler
03:50 AM Backport #45884 (Resolved): octopus: osd-scrub-repair.sh: SyntaxError: invalid syntax
Brad Hubbard
12:31 AM Backport #45884: octopus: osd-scrub-repair.sh: SyntaxError: invalid syntax
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/35445
merged
Yuri Weinstein
02:50 PM Backport #45779: octopus: rados/test_envlibrados_for_rocksdb.sh build failure (seen in nautilus)
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/35443
m...
Nathan Cutler
03:47 AM Backport #45779 (Resolved): octopus: rados/test_envlibrados_for_rocksdb.sh build failure (seen in...
Brad Hubbard
12:30 AM Backport #45779: octopus: rados/test_envlibrados_for_rocksdb.sh build failure (seen in nautilus)
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/35443
merged
Yuri Weinstein
02:49 PM Backport #45673 (Resolved): octopus: qa: powercycle: install task runs twice with double unwind c...
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/35441
m...
Nathan Cutler
12:30 AM Backport #45673: octopus: qa: powercycle: install task runs twice with double unwind causing fata...
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/35441
merged
Yuri Weinstein
09:33 AM Documentation #45988: [doc/os]: Centos 8 is not listed even though it is supported
I confirm that there is a row in this table that mentions Centos 8, and that this line appears when I build the docs ... Zac Dover
09:24 AM Documentation #45988 (Resolved): [doc/os]: Centos 8 is not listed even though it is supported
19
https://docs.ceph.com/docs/master/releases/octopus/
https://docs.ceph.com/docs/octopus/start/os-recommendations/...
Zac Dover

06/11/2020

05:23 PM Backport #45780: nautilus: rados/test_envlibrados_for_rocksdb.sh build failure (seen in nautilus)
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/35387
m...
Nathan Cutler
01:26 AM Bug #45795 (Pending Backport): PrimaryLogPG.cc: 627: FAILED ceph_assert(!get_acting_recovery_back...
Kefu Chai
01:21 AM Bug #38357: ClsLock.TestExclusiveEphemeralStealEphemeral failed
... Kefu Chai

06/10/2020

09:30 PM Bug #45916 (Fix Under Review): cls_lock: unlimited shared lock created by libradosstriper api let...
Neha Ojha
09:25 PM Bug #43861 (Pending Backport): ceph_test_rados_watch_notify hang
Let's remove these tests from the stable branches too. Josh Durgin
09:02 AM Feature #41564 (In Progress): Issue health status warning if num_shards_repaired exceeds some thr...
David Zafman
12:25 AM Bug #44314 (Pending Backport): osd-backfill-stats.sh failing intermittently in TEST_backfill_size...
David Zafman

06/09/2020

09:34 PM Backport #45780 (Resolved): nautilus: rados/test_envlibrados_for_rocksdb.sh build failure (seen i...
Brad Hubbard
02:58 PM Backport #45780: nautilus: rados/test_envlibrados_for_rocksdb.sh build failure (seen in nautilus)
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/35387
merged
Yuri Weinstein
09:02 PM Bug #42716: Pool creation error message is hidden on FileStore-backed pools
That wasn't the initial issue reported.
What happens if you run "ceph osd pool create foo2 2048" instead? (assumin...
Dimitri Savineau
07:38 PM Bug #42716 (Resolved): Pool creation error message is hidden on FileStore-backed pools
closing this as already resolved.... Deepika Upadhyay
02:41 PM Bug #36337: OSDs crash with failed assertion in PGLog::merge_log as logs do not overlap
... Neha Ojha
02:41 PM Bug #45956 (New): verify takes forever to finish
rados/verify/{centos_latest.yaml ceph.yaml clusters/{fixed-2.yaml openstack.yaml} d-thrash/default/{default.yaml thra... Kefu Chai
12:24 PM Bug #45661 (Resolved): valgrind issue: UninitValue in ProtocolV2
In @master@ the PR #35407 has been closed in favor of https://github.com/ceph/ceph/pull/35186.
#35407 still might be...
Radoslaw Zarzynski
06:34 AM Bug #45948 (Duplicate): ceph_test_rados_delete_pools_parallel failed with error -2 on nautilus
Oops, this is a dup of #43887 Brad Hubbard
06:31 AM Bug #45948 (Duplicate): ceph_test_rados_delete_pools_parallel failed with error -2 on nautilus
/a/yuriw-2020-06-08_16:06:08-rados-wip-yuri2-testing-2020-06-08-1458-nautilus-distro-basic-smithi/5129541... Brad Hubbard
06:06 AM Bug #45947: ceph_test_rados_watch_notify hang seen in nautilus
Note https://tracker.ceph.com/issues/43861 removed this test from master because it was hanging. Brad Hubbard
06:02 AM Bug #45947: ceph_test_rados_watch_notify hang seen in nautilus
This is very similar to what is seen in #45946 so they may be related. Brad Hubbard
06:01 AM Bug #45947 (New): ceph_test_rados_watch_notify hang seen in nautilus
/a/yuriw-2020-06-08_16:06:08-rados-wip-yuri2-testing-2020-06-08-1458-nautilus-distro-basic-smithi/5129565... Brad Hubbard
05:32 AM Bug #45946 (New): ceph_test_rados_delete_pools_parallel hang seen in octopus
/a/yuriw-2020-05-29_15:51:00-rados-wip-yuri-testing-2020-05-28-2238-octopus-distro-basic-smithi/5103106... Brad Hubbard
04:28 AM Bug #20960: ceph_test_rados: mismatched version (due to pg import/export)
... Kefu Chai
12:05 AM Bug #44510: osd/osd-recovery-space.sh TEST_recovery_test_simple failure
Seen again:
http://pulpito.ceph.com/dzafman-2020-06-08_11:45:40-rados-wip-zafman-testing-distro-basic-smithi/5130114
David Zafman

06/08/2020

11:51 PM Bug #43888: osd/osd-bench.sh 'tell osd.N bench' hang
Saw this in at least 17 jobs:
http://pulpito.ceph.com/dzafman-2020-06-08_11:45:40-rados-wip-zafman-testing-distro-...
David Zafman
11:39 PM Bug #45944 (Triaged): osd/osd-markdown.sh: TEST_osd_stop failed
This appears to be a rare condition where a 15-second sleep was not enough. Neha Ojha
09:14 PM Bug #45944 (Triaged): osd/osd-markdown.sh: TEST_osd_stop failed
... Neha Ojha
09:10 PM Bug #45318: Health check failed: 2/6 mons down, quorum b,a,c,e (MON_DOWN)" in cluster log running...
rados/multimon/{clusters/21 msgr-failures/few msgr/async-v1only no_pools objectstore/bluestore-comp-zlib rados suppor... Neha Ojha
07:39 PM Bug #45943 (Fix Under Review): Ceph Monitor heartbeat grace period does not reset.
Sridhar Seshasayee
07:09 PM Bug #45943 (Resolved): Ceph Monitor heartbeat grace period does not reset.
The heartbeat grace timer does not reset even after the cluster network has been stable for multiple days.
Implement a mechanism to...
Sridhar Seshasayee
06:31 PM Backport #45891 (In Progress): luminous: osd: pg stuck in waitactingchange when new acting set do...
Nathan Cutler
06:22 PM Backport #45892 (In Progress): mimic: osd: pg stuck in waitactingchange when new acting set doesn...
Nathan Cutler
12:51 PM Bug #45795 (Fix Under Review): PrimaryLogPG.cc: 627: FAILED ceph_assert(!get_acting_recovery_back...
Ilya Dryomov
07:01 AM Bug #45916: cls_lock: unlimited shared lock created by libradosstriper api let node crash
add pr: https://github.com/ceph/ceph/pull/35467 Zhenyi Shu
06:50 AM Bug #45916 (Fix Under Review): cls_lock: unlimited shared lock created by libradosstriper api let...
_Background: Ceph Luminous is running in our production environment, and a service uses the libradosstriper API to access Ceph._
W...
Zhenyi Shu

06/06/2020

08:45 AM Backport #45357 (Resolved): octopus: rados: Sharded OpWQ drops suicide_grace after waiting for work
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/34881
m...
Nathan Cutler
08:31 AM Backport #45884 (In Progress): octopus: osd-scrub-repair.sh: SyntaxError: invalid syntax
Nathan Cutler
08:31 AM Backport #45882 (In Progress): octopus: Objecter: don't attempt to read from non-primary on EC pools
Nathan Cutler
08:30 AM Backport #45779 (In Progress): octopus: rados/test_envlibrados_for_rocksdb.sh build failure (seen...
Nathan Cutler
08:29 AM Backport #45775 (In Progress): octopus: build_incremental_map_msg missing incremental map while s...
Nathan Cutler
08:28 AM Backport #45673 (In Progress): octopus: qa: powercycle: install task runs twice with double unwin...
Nathan Cutler
12:53 AM Bug #44314 (In Progress): osd-backfill-stats.sh failing intermittently in TEST_backfill_sizeup_ou...
David Zafman

06/05/2020

10:52 PM Bug #44314: osd-backfill-stats.sh failing intermittently in TEST_backfill_sizeup_out() (degraded ...

It would be helpful to see the osd logs when this happens. We are expecting the following sequence to occur.
St...
David Zafman
04:20 PM Bug #45721: CommandFailedError: Command failed (workunit test rados/test_python.sh) FAIL: test_ra...
/a/yuriw-2020-06-04_18:03:48-rados-wip-yuri2-testing-2020-06-03-2341-MASTER-distro-basic-smithi/5117777 Neha Ojha
04:17 PM Bug #45424: api_watch_notify_pp: [ FAILED ] LibRadosWatchNotifyECPP.WatchNotify watch_notify_cx...
/a/yuriw-2020-06-04_18:03:48-rados-wip-yuri2-testing-2020-06-03-2341-MASTER-distro-basic-smithi/5117783 Neha Ojha
04:01 PM Bug #20960: ceph_test_rados: mismatched version (due to pg import/export)
/a/yuriw-2020-06-04_18:03:48-rados-wip-yuri2-testing-2020-06-03-2341-MASTER-distro-basic-smithi/5118028 Neha Ojha
03:58 PM Bug #44517: osd/osd-backfill-space.sh TEST_backfill_multi_partial: pgs didn't go active+clean
... Neha Ojha

06/04/2020

09:15 PM Bug #45868: rados_api_tests: LibRadosWatchNotify.AioWatchNotify2 fails
Similar... Neha Ojha
09:06 PM Bug #45661 (Fix Under Review): valgrind issue: UninitValue in ProtocolV2
https://github.com/ceph/ceph/pull/35407 Radoslaw Zarzynski
10:07 AM Bug #45661: valgrind issue: UninitValue in ProtocolV2
Pin-pointed to a branch of @PrimaryLogPG::do_manifest_flush()@:... Radoslaw Zarzynski
08:36 AM Bug #45661: valgrind issue: UninitValue in ProtocolV2
... Radoslaw Zarzynski
06:08 PM Bug #45795: PrimaryLogPG.cc: 627: FAILED ceph_assert(!get_acting_recovery_backfill().empty())
Ah, that makes sense. It should suffice to simply not call populate_obc_watchers on a replica. Samuel Just
05:42 PM Bug #45795: PrimaryLogPG.cc: 627: FAILED ceph_assert(!get_acting_recovery_backfill().empty())
After more digging, this doesn't appear to be related to notifies being sent to replicas.
The issue seems to be wi...
Ilya Dryomov
12:48 PM Backport #45890 (In Progress): nautilus: osd: pg stuck in waitactingchange when new acting set do...
Nathan Cutler
11:58 AM Backport #45890 (Resolved): nautilus: osd: pg stuck in waitactingchange when new acting set doesn...
https://github.com/ceph/ceph/pull/35389 Nathan Cutler
12:44 PM Backport #45883 (In Progress): nautilus: osd-scrub-repair.sh: SyntaxError: invalid syntax
Nathan Cutler
11:55 AM Backport #45883 (Resolved): nautilus: osd-scrub-repair.sh: SyntaxError: invalid syntax
https://github.com/ceph/ceph/pull/35388 Nathan Cutler
12:44 PM Backport #45780 (In Progress): nautilus: rados/test_envlibrados_for_rocksdb.sh build failure (see...
Nathan Cutler
12:43 PM Backport #45776 (In Progress): nautilus: build_incremental_map_msg missing incremental map while ...
Nathan Cutler
11:59 AM Backport #45892 (Rejected): mimic: osd: pg stuck in waitactingchange when new acting set doesn't ...
https://github.com/ceph/ceph/pull/35484 Nathan Cutler
11:59 AM Backport #45891 (Rejected): luminous: osd: pg stuck in waitactingchange when new acting set doesn...
https://github.com/ceph/ceph/pull/35485 Nathan Cutler
11:55 AM Backport #45884 (Resolved): octopus: osd-scrub-repair.sh: SyntaxError: invalid syntax
https://github.com/ceph/ceph/pull/35445 Nathan Cutler
11:55 AM Backport #45882 (Resolved): octopus: Objecter: don't attempt to read from non-primary on EC pools
https://github.com/ceph/ceph/pull/35444 Nathan Cutler
07:16 AM Bug #45871 (New): Incorrect (0) number of slow requests in health check
ceph version 14.2.9-899-gc02349c600 (c02349c60052aaa6c7bd0c2270c7f7be16fab632) nautilus (stable)
Our cluster shows...
Eugen Block
12:24 AM Bug #40117 (Duplicate): PG stuck in WaitActingChange
Fixed in https://tracker.ceph.com/issues/41190 Neha Ojha
12:21 AM Bug #41190 (Pending Backport): osd: pg stuck in waitactingchange when new acting set doesn't change
Neha Ojha
12:20 AM Bug #41236 (Resolved): cosbench failures in rados/perf
Neha Ojha
12:18 AM Bug #41550 (Resolved): os/bluestore: fadvise_flag leak in generate_transaction
Neha Ojha
12:17 AM Bug #41677 (Resolved): Cephmon:fix mon crash
Fixed as a part of https://tracker.ceph.com/issues/41680. Neha Ojha
12:14 AM Bug #41913 (Resolved): With auto scaler operating stopping an OSD can lead to COT crashing instea...
Neha Ojha
12:08 AM Bug #45356 (Resolved): nautilus: rados/upgrade/mimic-x-singleton failures due to mon_client_direc...
Neha Ojha

06/03/2020

09:06 PM Bug #45733 (Pending Backport): osd-scrub-repair.sh: SyntaxError: invalid syntax
Neha Ojha
06:12 PM Bug #45733: osd-scrub-repair.sh: SyntaxError: invalid syntax
https://github.com/ceph/ceph/pull/35279 merged Yuri Weinstein
08:50 PM Backport #45357: octopus: rados: Sharded OpWQ drops suicide_grace after waiting for work
Dan Hill wrote:
> https://github.com/ceph/ceph/pull/34881
merged
Yuri Weinstein
08:34 PM Bug #45868 (Resolved): rados_api_tests: LibRadosWatchNotify.AioWatchNotify2 fails
... Neha Ojha
08:30 PM Bug #45761: mon_thrasher: "Error ENXIO: mon unavailable" during sync_force command leads to "fail...
/a/yuriw-2020-06-02_15:07:59-rados-wip-yuri7-testing-2020-06-01-2256-octopus-distro-basic-smithi/5113082 - octopus Neha Ojha
04:44 AM Bug #45761: mon_thrasher: "Error ENXIO: mon unavailable" during sync_force command leads to "fail...
Moving this since it appears to be a problem with the mon_thrasher (or the MONs or monclients).... Brad Hubbard
02:44 PM Bug #45793 (Pending Backport): Objecter: don't attempt to read from non-primary on EC pools
Kefu Chai
01:24 PM Backport #41533: mimic: Move bluefs alloc size initialization log message to log level 1
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/30219
m...
Nathan Cutler
12:59 PM Bug #45857 (New): crimson/alien_store: alienstore cannot open_collections
setup: setting debug level 20 for bluestore, filestore and osd and using seastar with seastar_default_allocator + Rel... Deepika Upadhyay
01:50 AM Bug #9984: lttng_probe_unregister hangs on shutdown
/a/yuriw-2020-05-30_02:18:17-rados-wip-yuri-master_5.29.20-distro-basic-smithi/5104372
Possibly an instance of thi...
Brad Hubbard
 
